https://github.com/ai-forever/readingpipeline
Text reading pipeline that combines segmentation and OCR-models.
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.0%) to scientific vocabulary
Keywords
Repository
Text reading pipeline that combines segmentation and OCR-models.
Basic Info
Statistics
- Stars: 26
- Watchers: 2
- Forks: 7
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
Reading Pipeline
This is a pipeline for text detection and reading. It combines OCR and segmentation models into a single pipeline: it segments an input image, crops the text regions from it, and finally reads those texts using OCR.
Demo
A web demo (on Hugging Face) of ReadingPipeline is available for the Peter the Great dataset, and another web demo covers the school notebook recognition dataset.
There is also a demo notebook with an example of using ReadingPipeline (you can run it in Google Colab).
Models
Weights for reading the manuscripts of Peter the Great, and the Peter dataset itself.
Weights for reading the handwritten school notebooks dataset, and the school notebook datasets themselves: RU data and EN data.
Quick setup and start
- Nvidia drivers >= 470, CUDA >= 11.4
- Docker, nvidia-docker
The provided Dockerfile builds an image with CUDA and cuDNN support.
Preparations
- Clone the repo.
- Download the weights and config files of the segmentation and OCR models to the `data/` folder.
- Run `sudo make all` to build a docker image and create a container, or `sudo make all GPUS=device=0 CPUS=10` if you want to specify GPU devices and limit CPU resources.

If you don't want to use Docker, you can install the dependencies via requirements.txt.
Configuring the pipeline
You can change parameters of the pipeline in the pipeline_config.json.
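As a quick sketch, the config can be loaded and inspected like any nested JSON document (the keys below are illustrative examples based on the snippets later in this README, not a complete config):

```python
# Loading pipeline parameters from a JSON config (illustrative keys only).
import json

config_text = """
{
    "main_process": {
        "SegmPrediction": {"device": "cuda", "runtime": "Pytorch"}
    }
}
"""
config = json.loads(config_text)

# Read one parameter back out of the nested dict.
device = config["main_process"]["SegmPrediction"]["device"]
print(device)  # cuda
```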
Main pipeline loop
The main_process dict defines the order of the main processing methods that make up the pipeline loop. Classes are initialized with the parameters specified in the config and are called one after the other in the predefined order.
PipelinePredictor is the class responsible for assembling the pipeline; it is located in ocrpipeline/predictor.py. To add a new class to the pipeline, add it to the MAIN_PROCESS_DICT dictionary in ocrpipeline/predictor.py and also specify it in the main_process dict in the config, at the point in the chain from which the class should be called.
"main_process": {
"SegmPrediction": {...},
"RestoreImageAngle": {...},
"ClassContourPosptrocess": {...},
"OCRPrediction": {...},
"LineFinder": {...},
...
}
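The assemble-then-call pattern described above can be sketched as follows. The step classes here (AddTwo, Square) are hypothetical stand-ins, not classes from the repo; only the registry-plus-config idea mirrors the actual design:

```python
# Chain-of-steps pattern: a registry maps class names to classes; the
# config decides which steps run, in what order, and with what params.
class AddTwo:
    def __init__(self, n=2):
        self.n = n

    def __call__(self, value):
        return value + self.n


class Square:
    def __call__(self, value):
        return value * value


# Analogous in spirit to MAIN_PROCESS_DICT in ocrpipeline/predictor.py.
STEP_REGISTRY = {"AddTwo": AddTwo, "Square": Square}


def build_pipeline(main_process_config):
    """Instantiate each step with its config params, preserving order."""
    return [STEP_REGISTRY[name](**params)
            for name, params in main_process_config.items()]


def run_pipeline(steps, value):
    """Call the steps one after another, threading the value through."""
    for step in steps:
        value = step(value)
    return value


steps = build_pipeline({"AddTwo": {"n": 3}, "Square": {}})
print(run_pipeline(steps, 4))  # (4 + 3) ** 2 = 49
```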
Models runtime, ONNX
You can specify runtime method for OCR and segmentation models.
"main_process": {
"SegmPrediction": {
"model_path": "/path/to/model.ckpt",
"config_path": "/path/to/config.json",
"num_threads": 8,
"device": "cuda",
"runtime": "Pytorch" # here you can choose the runtime method
},
...
}
You can choose the runtime method from several options: "Pytorch" (cuda and cpu devices), "ONNX" (cpu only) or "OpenVino" (cpu only).
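The runtime/device constraint above can be expressed as a small validation table. This check_runtime helper is hypothetical (not part of the repo); it just encodes the combinations the README lists:

```python
# Which devices each runtime supports, per the README: PyTorch runs on
# cuda or cpu, while ONNX and OpenVino are cpu-only.
ALLOWED_DEVICES = {
    "Pytorch": {"cuda", "cpu"},
    "ONNX": {"cpu"},
    "OpenVino": {"cpu"},
}


def check_runtime(runtime, device):
    """Raise ValueError for an unsupported runtime/device combination."""
    devices = ALLOWED_DEVICES.get(runtime)
    if devices is None:
        raise ValueError(f"Unknown runtime: {runtime!r}")
    if device not in devices:
        raise ValueError(f"{runtime} does not support device {device!r}")
    return True


check_runtime("Pytorch", "cuda")   # ok
check_runtime("ONNX", "cpu")       # ok
# check_runtime("ONNX", "cuda")    # would raise ValueError
```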
Class specific parameters
Parameters in the classes dict are set individually for each class. The class names must match the class names of the segmentation model.
The contour_posptrocess dict defines the order in which the contours predicted by the segmentation model are processed. Classes are initialized with the parameters specified in the config and are called one after the other in the predefined order.
ClassContourPosptrocess is the class responsible for assembling and calling the contour_posptrocess methods; it is located in ocrpipeline/predictor.py. To add a new class to the pipeline, add it to the CONTOUR_PROCESS_DICT dictionary in ocrpipeline/predictor.py and also specify it in the contour_posptrocess dict in the config, at the point in the chain from which the class should be called.
"classes": {
"shrinked_pupil_text": {
"contour_posptrocess": {
"BboxFromContour": {},
"UpscaleBbox": {"upscale_bbox": [1.4, 2.3]}
}
},
...
}
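As an illustration of what an upscale factor like [1.4, 2.3] could do, here is a hypothetical bbox-upscaling helper (not the repo's implementation) that grows a box around its center by separate x and y factors before the region is cropped for OCR:

```python
# Grow a bounding box around its center by per-axis scale factors.
def upscale_bbox(bbox, upscale_x, upscale_y):
    """bbox = (x_min, y_min, x_max, y_max); returns the enlarged box."""
    x_min, y_min, x_max, y_max = bbox
    width, height = x_max - x_min, y_max - y_min
    new_w, new_h = width * upscale_x, height * upscale_y
    cx, cy = x_min + width / 2, y_min + height / 2
    return (cx - new_w / 2, cy - new_h / 2,
            cx + new_w / 2, cy + new_h / 2)


# A 20x10 box scaled by (1.4, 2.3) around its center (20, 15).
print(upscale_bbox((10, 10, 30, 20), 1.4, 2.3))  # ~ (6.0, 3.5, 34.0, 26.5)
```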
Inference
An example of model inference can be found in inference_pipeline.ipynb.
To evaluate the pipeline accuracy (the OCR model combined with the segmentation model), you can use the evaluate script (you first need to generate model predictions; see the example in inference_pipeline_on_dataset.ipynb).
Owner
- Name: AI Forever
- Login: ai-forever
- Kind: organization
- Location: Armenia
- Repositories: 60
- Profile: https://github.com/ai-forever
Creating ML for the future. AI projects you already know. We are a non-profit organization with members from all over the world.
GitHub Events
Total
Last Year
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- Pillow ==8.4.0
- matplotlib ==3.5.0
- numpy ==1.21.4
- opencv-python ==4.6.0.66
- pudb ==2021.1
- pyclipper ==1.3.0
- scikit-learn ==1.0.1
- shapely ==1.8.0
- torch >=1.6.0
- tqdm ==4.62.3
- nvidia/cuda 11.4.0-devel-ubuntu20.04 build