multi-domain-ocr
Repository
Basic Info
- Host: GitHub
- Owner: Jiayou-Chao
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 516 KB
Statistics
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
Multi-domain-OCR
This project is the codebase for the paper "Multi-domain OCR with Meta Self-Learning" (https://arxiv.org/abs/2401.00971). The code is based on MMOCR.
Installation
The environment setup is the same as for MMOCR. Alternatively, you can use the setup.sh script to install the environment:
```bash
# (Optional) Create a conda environment
conda create -n multi-domain-ocr python=3.10 -y
conda activate multi-domain-ocr

# Set up the environment
bash setup.sh
```
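After installation, it can help to confirm which versions of the OpenMMLab packages ended up in the environment. The following is a minimal sketch using only the Python standard library. The version ranges are copied from the stricter entries in this repository's dependency list, and the helper name `installed_versions` is hypothetical, not part of this codebase or of MMOCR.

```python
from importlib import metadata

# Version ranges copied from the dependency list of this repository.
# They are printed for comparison only; this sketch does not enforce them.
EXPECTED = {
    "mmcv": ">=2.0.0rc4,<2.1.0",
    "mmdet": ">=3.0.0rc5,<3.1.0",
    "mmengine": ">=0.7.0,<1.0.0",
}


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, found in installed_versions(EXPECTED).items():
        print(f"{name}: installed={found or 'missing'}, expected {EXPECTED[name]}")
```

A package reported as `missing` suggests that the setup script did not complete or that a different environment is active.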
Prepare Dataset
The dataset used for evaluation is the open-source MSDA dataset (Multi-source domain adaptation dataset for text recognition). Please refer to the dataset homepage for the download link. The dataset is distributed as tar files. Extract them, then use tools/dataset_converters/textrecog/lmdb_converter.py to convert the dataset to LMDB format. Assuming the dataset is stored in data/Meta-SelfLearning and is to be extracted to data/cache/, the following example converts the syn dataset to LMDB format:
```bash
# Extract the dataset
mkdir -p data/cache/
tar -xvf data/Meta-SelfLearning/syn/test_imgs.tar -C data/cache/

# Convert the dataset to lmdb format
python tools/dataset_converters/textrecog/lmdb_converter.py \
    data/Meta-SelfLearning/syn/test_label.txt \
    data/Meta-SelfLearning/LMDB/syn/test_imgs.lmdb \
    -i data/cache/Meta-SelfLearning/root/data/TextRecognitionDatasets/IMG/syn/test_imgs/ \
    --label-format txt
```
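The same two steps have to be repeated for every domain and split, so the commands can be generated rather than typed by hand. The sketch below only builds and prints the command lines and does not run them. It assumes that every archive is named `<split>_imgs.tar` with a matching `<split>_label.txt`, and that every domain unpacks under the same prefix as the syn example. The domain list, the `train` split, and the helper names `extract_cmd` and `convert_cmd` are assumptions for illustration.

```python
import shlex
from pathlib import PurePosixPath

DATA_ROOT = PurePosixPath("data/Meta-SelfLearning")
# Assumption: all domains unpack under the same prefix as the syn example.
CACHE_IMG_ROOT = PurePosixPath(
    "data/cache/Meta-SelfLearning/root/data/TextRecognitionDatasets/IMG"
)
CONVERTER = "tools/dataset_converters/textrecog/lmdb_converter.py"


def extract_cmd(domain, split):
    """Build the tar command that unpacks one image archive into data/cache/."""
    archive = DATA_ROOT / domain / f"{split}_imgs.tar"
    return ["tar", "-xvf", str(archive), "-C", "data/cache/"]


def convert_cmd(domain, split):
    """Build the lmdb_converter.py command for one domain and split."""
    return [
        "python", CONVERTER,
        str(DATA_ROOT / domain / f"{split}_label.txt"),
        str(DATA_ROOT / "LMDB" / domain / f"{split}_imgs.lmdb"),
        "-i", str(CACHE_IMG_ROOT / domain / f"{split}_imgs") + "/",
        "--label-format", "txt",
    ]


if __name__ == "__main__":
    # Assumption: only the domains named in this README; "train" is a guess.
    for domain in ("syn", "docu"):
        for split in ("train", "test"):
            print(shlex.join(extract_cmd(domain, split)))
            print(shlex.join(convert_cmd(domain, split)))
```

Printing the commands first makes it easy to check the paths against the extracted directory layout before anything is converted.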
Training
The training script is tools/train.py. The following is an example of training a model, where configs/path/to/config.py stands for the config file of your choice (for example, one for the syn dataset):
```bash
python tools/train.py configs/path/to/config.py
```
To train on multiple GPUs, use tools/dist_train.sh. The second argument is the number of GPUs:
```bash
tools/dist_train.sh configs/path/to/config.py 8 --auto-scale-lr --amp
```
The config files to reproduce the results in the paper are in configs/. The following is an example of training the backbone on the docu dataset:
```bash
tools/dist_train.sh configs/textrecog/adapter/backbone_docu.py 8 --auto-scale-lr --amp
```
The following is an example of training the adapter on the syn dataset:
```bash
tools/dist_train.sh configs/textrecog/adapter/adapter_docu_adapter_syn.py 8 --auto-scale-lr --amp
```
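The two training commands above can be assembled programmatically, for example to change the GPU count in one place. The sketch below only builds and prints the command lines. It assumes that the backbone is trained before the adapter, which this README does not state explicitly, and it does not cover how the backbone checkpoint is handed to the adapter config. The helper names `dist_train_cmd` and `training_plan` are hypothetical.

```python
import shlex

# Config paths taken from the examples in this README.
BACKBONE_CFG = "configs/textrecog/adapter/backbone_docu.py"
ADAPTER_CFG = "configs/textrecog/adapter/adapter_docu_adapter_syn.py"


def dist_train_cmd(config, num_gpus=8, auto_scale_lr=True, amp=True):
    """Build a tools/dist_train.sh command line as a list of arguments."""
    cmd = ["tools/dist_train.sh", config, str(num_gpus)]
    if auto_scale_lr:
        cmd.append("--auto-scale-lr")
    if amp:
        cmd.append("--amp")
    return cmd


def training_plan(num_gpus=8):
    """Return the two commands in the assumed order: backbone, then adapter."""
    return [
        dist_train_cmd(BACKBONE_CFG, num_gpus),
        dist_train_cmd(ADAPTER_CFG, num_gpus),
    ]


if __name__ == "__main__":
    for cmd in training_plan():
        print(shlex.join(cmd))
```

Calling `training_plan(4)` would produce the same pair of commands for a 4-GPU machine.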
Owner
- Login: Jiayou-Chao
- Kind: user
- Repositories: 3
- Profile: https://github.com/Jiayou-Chao
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "OpenMMLab Text Detection, Recognition and Understanding Toolbox"
authors:
  - name: "MMOCR Contributors"
version: 0.3.0
date-released: 2020-08-15
repository-code: "https://github.com/open-mmlab/mmocr"
license: Apache-2.0
Dependencies
- pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build
- albumentations >=1.1.0
- numpy *
- pyclipper *
- torch >=1.1
- docutils ==0.16.0
- markdown >=3.4.0
- myst-parser *
- sphinx ==4.0.2
- sphinx-tabs *
- sphinx_copybutton *
- sphinx_markdown_tables >=0.0.16
- tabulate *
- mmcv >=2.0.0rc4,<2.1.0
- mmdet >=3.0.0rc5,<3.1.0
- mmengine >=0.7.0,<1.0.0
- imgaug *
- kwarray *
- lmdb *
- matplotlib *
- mmcv >=2.0.0rc1
- mmdet >=3.0.0rc0
- mmengine >=0.1.0
- pyclipper *
- rapidfuzz >=2.0.0
- regex *
- scikit-image *
- scipy *
- shapely *
- titlecase *
- torch *
- torchvision *
- imgaug *
- lmdb *
- matplotlib *
- numpy *
- opencv-python >=4.2.0.32,
- pyclipper *
- pycocotools *
- rapidfuzz >=2.0.0
- scikit-image *
- asynctest * test
- codecov * test
- flake8 * test
- interrogate * test
- isort * test
- kwarray * test
- lanms-neo ==1.0.2 test
- parameterized * test
- pytest * test
- pytest-cov * test
- pytest-runner * test
- ubelt * test
- xdoctest >=0.10.0 test
- yapf * test