https://github.com/ai-forever/ocr-model
An easy-to-run OCR model pipeline based on CRNN and CTC loss
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file: not found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references: not found
- ✓ Academic publication links: links to arxiv.org
- ○ Committers with academic emails: not found
- ○ Institutional organization owner: not found
- ○ JOSS paper metadata: not found
- ○ Scientific vocabulary similarity: low similarity (12.0%) to scientific vocabulary
Repository
Basic Info
Statistics
- Stars: 46
- Watchers: 3
- Forks: 20
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
OCR model
This is a model for Optical Character Recognition based on a CRNN architecture and CTC loss.
OCR-model is part of the ReadingPipeline repo.
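For intuition, here is a minimal, self-contained sketch of a CRNN trained with CTC loss in PyTorch. The layer sizes, alphabet size, and tensor shapes below are illustrative assumptions, not the repository's actual architecture:

```python
import torch
import torch.nn as nn

class TinyCRNN(nn.Module):
    """Illustrative CRNN: CNN features -> BiLSTM -> per-timestep class scores."""
    def __init__(self, num_classes: int, img_height: int = 32):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
        )
        feat_h = img_height // 4  # two 2x2 poolings shrink height by 4
        self.rnn = nn.LSTM(128 * feat_h, 256, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(512, num_classes)  # num_classes includes the CTC blank

    def forward(self, x):                       # x: (B, 1, H, W)
        f = self.cnn(x)                         # (B, C, H/4, W/4)
        b, c, h, w = f.shape
        f = f.permute(0, 3, 1, 2).reshape(b, w, c * h)  # width becomes the time axis
        out, _ = self.rnn(f)
        return self.fc(out).log_softmax(2)      # (B, T, num_classes)

# CTC loss expects log-probs shaped (T, B, C); class 0 is reserved for blank.
model = TinyCRNN(num_classes=27)                # e.g. 26 letters + blank
log_probs = model(torch.randn(4, 1, 32, 128)).permute(1, 0, 2)
targets = torch.randint(1, 27, (4, 10))         # dummy transcriptions
input_lengths = torch.full((4,), log_probs.size(0), dtype=torch.long)
target_lengths = torch.full((4,), 10, dtype=torch.long)
loss = nn.CTCLoss(blank=0)(log_probs, targets, input_lengths, target_lengths)
```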
Demo
The demo shows an example of using the OCR-model (you can run it in Google Colab).
Quick setup and start
- Nvidia drivers >= 470, CUDA >= 11.4
- Docker, nvidia-docker
A Dockerfile is provided to build an image with CUDA and cuDNN support.
Preparations
- Clone the repo.
- Download and extract the dataset to the `data/` folder.
- Run `sudo make all` to build a docker image and create a container, or `sudo make all GPUS=device=0 CPUS=10` if you want to specify GPU devices and limit CPU resources.
If you don't want to use Docker, you can install the dependencies via `requirements.txt`.
Configuring the model
You can edit `ocr_config.json` to set the necessary training and evaluation parameters: alphabet, image size, saving path, etc.
"train": {
"datasets": [
{
"csv_path": "/workdir/data/dataset_1/train.csv",
"prob": 0.5
},
{
"csv_path": "/workdir/data/dataset_2/train.csv",
"prob": 0.7
},
...
],
"epoch_size": 10000,
"batch_size": 512
}
- `epoch_size` - the size of an epoch. If you set it to null, the epoch size will equal the total number of samples across all the datasets.
- It is also possible to specify several datasets for train/validation/test, setting a probability for each dataset separately (the sum of the `prob` values can be greater than 1, since they are normalized during processing); see the sketch below.
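To illustrate the normalization mentioned above, here is one way per-dataset `prob` values could be turned into a normalized sampling distribution (a hypothetical helper, not the repo's actual code):

```python
import random

def sample_dataset_indices(probs, epoch_size, seed=0):
    """Draw a dataset index per sample, normalizing probs so they sum to 1."""
    total = sum(probs)
    weights = [p / total for p in probs]  # [0.5, 0.7] -> [0.417, 0.583]
    rng = random.Random(seed)
    return rng.choices(range(len(weights)), weights=weights, k=epoch_size)

# With probs 0.5 and 0.7, dataset 2 supplies ~58% of each epoch.
indices = sample_dataset_indices([0.5, 0.7], epoch_size=10000)
```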
Prepare data
Datasets must be pre-processed into a single format: each dataset must contain a folder with images (crops containing text) and a CSV file with annotations. The CSV file should have two columns: "filename", with the relative path to the image (folder-name/image-name.png), and "text", with the image transcription.
| filename          | text |
| ----------------- | ---- |
| images/4099-0.png | is   |
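As an illustration, an annotation file in this format could be produced with pandas (the rows below are made up, and the output path assumes a `data/dataset_1/` folder already exists):

```python
import pandas as pd

# Hypothetical crops and transcriptions; paths are relative to the dataset folder.
annotations = pd.DataFrame({
    "filename": ["images/4099-0.png", "images/4099-1.png"],
    "text": ["is", "school"],
})
annotations.to_csv("data/dataset_1/train.csv", index=False)
```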
If you use polygon annotations in COCO format, you can prepare a training dataset using this script:
```bash
python scripts/prepare_dataset.py \
    --annotation_json_path path/to/the/annotations.json \
    --annotation_image_root dir/to/images/from/annotation/file \
    --class_names pupil_text pupil_comment teacher_comment \
    --bbox_scale_x 1 \
    --bbox_scale_y 1 \
    --save_dir dir/to/save/dataset \
    --output_csv_name data.csv
```
Training
To train the model:
```bash
python scripts/train.py --config_path path/to/the/ocr_config.json
```
Evaluating
To test the model:
```bash
python scripts/evaluate.py \
    --config_path path/to/the/ocr_config.json \
    --model_path path/to/the/model-weights.ckpt
```
If you want to use a beam-search decoder with a language model, pass the `--lmpath` argument with the path to a KenLM `.arpa` file: `--lmpath path/to/the/language-model.arpa`.
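For contrast with beam search, greedy CTC decoding simply takes the argmax per timestep, collapses repeats, and drops the blank token. A minimal sketch, assuming blank index 0 and an `alphabet` string that excludes the blank (both are assumptions, not the repo's config):

```python
import torch

def ctc_greedy_decode(log_probs: torch.Tensor, alphabet: str, blank: int = 0) -> str:
    """log_probs: (T, C) per-timestep log-probabilities for one image."""
    best = log_probs.argmax(dim=1).tolist()      # most likely class per timestep
    chars, prev = [], blank
    for idx in best:
        if idx != blank and idx != prev:         # collapse repeats, skip blanks
            chars.append(alphabet[idx - 1])      # shift by 1 past the blank
        prev = idx
    return "".join(chars)

# Toy usage: alphabet "ab", 4 timesteps, 3 classes (blank + 2 letters).
lp = torch.log_softmax(torch.randn(4, 3), dim=1)
print(ctc_greedy_decode(lp, "ab"))
```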
ONNX
You can convert the Torch model to ONNX to speed up inference on CPU:
```bash
python scripts/torch2onnx.py \
    --config_path path/to/the/ocr_config.json \
    --model_path path/to/the/model-weights.ckpt
```
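Once exported, the ONNX model can be run with onnxruntime. A minimal sketch, assuming the exported file is named `model.onnx`, the model has a single input and output, and the input is a preprocessed float32 image batch (all assumptions):

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

# Hypothetical preprocessed batch: (batch, channels, height, width), float32.
batch = np.random.rand(1, 3, 32, 256).astype(np.float32)
(log_probs,) = session.run(None, {input_name: batch})
print(log_probs.shape)  # per-timestep class scores, ready for CTC decoding
```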
Owner
- Name: AI Forever
- Login: ai-forever
- Kind: organization
- Location: Armenia
- Repositories: 60
- Profile: https://github.com/ai-forever
Creating ML for the future. AI projects you already know. We are a non-profit organization with members from all over the world.
GitHub Events
Total
- Watch event: 3
- Fork event: 1
Last Year
- Watch event: 3
- Fork event: 1
Committers
Last synced: 8 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Stanislav Kalinin | k****n@s****m | 69 |
| Julia132 | u****2@m****u | 8 |
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 4
- Total pull requests: 8
- Average time to close issues: 13 days
- Average time to close pull requests: about 2 hours
- Total issue authors: 3
- Total pull request authors: 1
- Average comments per issue: 1.0
- Average comments per pull request: 0.0
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- dhirendracode (1)
- BelowzeroA (1)
- timespace42 (1)
Pull Request Authors
- Julia132 (8)
Dependencies
- Pillow ==8.4.0
- albumentations ==1.1.0
- matplotlib ==3.5.0
- numpy ==1.21.4
- opencv-python ==4.6.0.66
- pandas ==1.3.4
- pudb ==2021.1
- scikit-learn ==1.0.1
- scipy ==1.4.1
- torch >=1.6.0
- tqdm ==4.62.3
- nvidia/cuda 11.4.0-devel-ubuntu20.04 build