https://github.com/ai-forever/ocr-model

An easy-to-run OCR model pipeline based on CRNN and CTC loss

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.0%) to scientific vocabulary

Keywords

crnn ocr pytorch text-recognition
Last synced: 5 months ago

Repository

An easy-to-run OCR model pipeline based on CRNN and CTC loss

Basic Info
  • Host: GitHub
  • Owner: ai-forever
  • License: MIT
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 93.8 KB
Statistics
  • Stars: 46
  • Watchers: 3
  • Forks: 20
  • Open Issues: 0
  • Releases: 0
Topics
crnn ocr pytorch text-recognition
Created over 4 years ago · Last pushed about 3 years ago
Metadata Files
  • Readme
  • License

README.md

OCR model

This is a model for Optical Character Recognition based on the CRNN architecture and CTC loss.

OCR-model is part of the ReadingPipeline repo.

Demo

The demo contains an example of using the OCR-model (you can run it in Google Colab).

Quick setup and start

A Dockerfile is provided to build an image with CUDA and cuDNN support.

Preparations

  • Clone the repo.
  • Download and extract the dataset to the data/ folder.
  • Run sudo make all to build a Docker image and create a container, or sudo make all GPUS=device=0 CPUS=10 if you want to specify GPU devices and limit CPU resources.

If you don't want to use Docker, you can install the dependencies via requirements.txt.

Configuring the model

You can edit ocr_config.json to set the necessary training and evaluation parameters: alphabet, image size, saving path, etc.

"train": { "datasets": [ { "csv_path": "/workdir/data/dataset_1/train.csv", "prob": 0.5 }, { "csv_path": "/workdir/data/dataset_2/train.csv", "prob": 0.7 }, ... ], "epoch_size": 10000, "batch_size": 512 } - epoch_size - the size of an epoch. If you set it to null, then the epoch size will be equal to the amount of samples in the all datasets. - It is also possible to specify several datasets for the train/validation/test, setting the probabilities for each dataset separately (the sum of prob can be greater than 1, since normalization occurs inside the processing).

Prepare data

Datasets must be pre-processed into a single format: each dataset must contain a folder with images (cropped images with text) and a CSV file with annotations. The CSV file should contain two columns: "filename" with the relative path to the image (folder-name/image-name.png) and "text" with the image transcription.

  | filename          | text |
  | ----------------- | ---- |
  | images/4099-0.png | is   |
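Since pandas is among the dependencies, reading such an annotation file could look like the sketch below (the path reuses the hypothetical dataset from the config example; the repo's loader may differ):

  import pandas as pd

  # Load the annotation CSV: "filename" holds a path relative to the
  # dataset root, "text" holds the transcription of the cropped image.
  df = pd.read_csv("/workdir/data/dataset_1/train.csv")
  assert {"filename", "text"}.issubset(df.columns)

  for row in df.itertuples():
      print(row.filename, "->", row.text)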

If you use polygon annotations in COCO format, you can prepare a training dataset using this script:

  python scripts/prepare_dataset.py \
      --annotation_json_path path/to/the/annotations.json \
      --annotation_image_root dir/to/images/from/annotation/file \
      --class_names pupil_text pupil_comment teacher_comment \
      --bbox_scale_x 1 \
      --bbox_scale_y 1 \
      --save_dir dir/to/save/dataset \
      --output_csv_name data.csv
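The --bbox_scale_x and --bbox_scale_y arguments suggest that crop boxes can be expanded around their centers before cutting. A hypothetical sketch of such scaling (not the script's actual code):

  def scale_bbox(x, y, w, h, scale_x=1.0, scale_y=1.0):
      """Expand a COCO-style (x, y, width, height) box around its center."""
      cx, cy = x + w / 2, y + h / 2
      new_w, new_h = w * scale_x, h * scale_y
      return cx - new_w / 2, cy - new_h / 2, new_w, new_h

  # A scale of 1 (as in the command above) leaves the box unchanged.
  print(scale_bbox(10, 20, 100, 30, scale_x=1.2))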

Training

To train the model:

  python scripts/train.py --config_path path/to/the/ocr_config.json
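Training optimizes CTC loss over the CRNN's per-timestep character distributions. A minimal PyTorch sketch of a single CTC loss computation (all shapes and tensors are placeholders, not the repo's model):

  import torch
  import torch.nn as nn

  # Placeholder shapes: 50 time steps from the recurrent head of a CRNN,
  # batch of 4, alphabet of 26 characters plus the CTC blank (index 0).
  T, N, C = 50, 4, 27
  log_probs = torch.randn(T, N, C, requires_grad=True).log_softmax(2)
  targets = torch.randint(1, C, (N, 10), dtype=torch.long)  # no blanks
  input_lengths = torch.full((N,), T, dtype=torch.long)
  target_lengths = torch.full((N,), 10, dtype=torch.long)

  ctc = nn.CTCLoss(blank=0)
  loss = ctc(log_probs, targets, input_lengths, target_lengths)
  loss.backward()  # in real training, gradients flow back through the CRNN
  print(loss.item())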

Evaluating

To test the model:

  python scripts/evaluate.py \
      --config_path path/to/the/ocr_config.json \
      --model_path path/to/the/model-weights.ckpt

If you want to use a beam-search decoder with a language model, pass the --lmpath argument with the path to a .arpa KenLM file:

  --lmpath path/to/the/language-model.arpa
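Without a language model, CTC output is typically decoded greedily: take the argmax at each time step, collapse repeats, and drop blanks; the beam-search decoder with a KenLM model is the more accurate alternative. A sketch of the greedy rule with a toy alphabet (not the repo's decoder):

  import itertools

  def ctc_greedy_decode(indices, alphabet, blank=0):
      """Collapse repeated indices, then remove blanks (standard CTC rule)."""
      collapsed = [k for k, _ in itertools.groupby(indices)]
      return "".join(alphabet[i] for i in collapsed if i != blank)

  # alphabet[0] is the CTC blank; the rest is the character set.
  alphabet = ["-", "h", "e", "l", "o"]
  print(ctc_greedy_decode([1, 1, 0, 2, 0, 3, 3, 0, 3, 4], alphabet))  # hello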

ONNX

You can convert the Torch model to ONNX to speed up inference on CPU:

  python scripts/torch2onnx.py \
      --config_path path/to/the/ocr_config.json \
      --model_path path/to/the/model-weights.ckpt
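The conversion essentially wraps torch.onnx.export. A minimal sketch with a placeholder model and a guessed input shape instead of the repo's actual CRNN:

  import torch

  # Placeholder network standing in for the trained CRNN.
  model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU())
  model.eval()

  # Dummy input: batch x channels x height x width of a text crop.
  dummy = torch.randn(1, 3, 32, 256)

  torch.onnx.export(
      model, dummy, "ocr_model.onnx",
      input_names=["image"], output_names=["logits"],
      dynamic_axes={"image": {0: "batch", 3: "width"}},  # variable batch/width
      opset_version=11,
  )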

Owner

  • Name: AI Forever
  • Login: ai-forever
  • Kind: organization
  • Location: Armenia

Creating ML for the future. AI projects you already know. We are a non-profit organization with members from all over the world.

GitHub Events

Total
  • Watch event: 3
  • Fork event: 1
Last Year
  • Watch event: 3
  • Fork event: 1

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 77
  • Total Committers: 2
  • Avg Commits per committer: 38.5
  • Development Distribution Score (DDS): 0.104
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
  Name               Email           Commits
  Stanislav Kalinin  k****n@s****m   69
  Julia132           u****2@m****u   8

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 4
  • Total pull requests: 8
  • Average time to close issues: 13 days
  • Average time to close pull requests: about 2 hours
  • Total issue authors: 3
  • Total pull request authors: 1
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • dhirendracode (1)
  • BelowzeroA (1)
  • timespace42 (1)
Pull Request Authors
  • Julia132 (8)

Dependencies

requirements.txt pypi
  • Pillow ==8.4.0
  • albumentations ==1.1.0
  • matplotlib ==3.5.0
  • numpy ==1.21.4
  • opencv-python ==4.6.0.66
  • pandas ==1.3.4
  • pudb ==2021.1
  • scikit-learn ==1.0.1
  • scipy ==1.4.1
  • torch >=1.6.0
  • tqdm ==4.62.3
Dockerfile docker
  • nvidia/cuda 11.4.0-devel-ubuntu20.04 build
setup.py pypi