wordart

The official code of CornerTransformer (ECCV 2022, Oral) on top of MMOCR.

https://github.com/xdxie/wordart

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (11.9%) to scientific vocabulary

Keywords

artistic-text-recognition contrastive-learning corner dataset ocr text-recognition transformer

Repository

Basic Info

Host: GitHub
Owner: xdxie
License: apache-2.0
Language: Python
Default Branch: main
Homepage:
Size: 9.58 MB

Statistics

Stars: 140
Watchers: 5
Forks: 16
Open Issues: 4
Releases: 0

Created over 3 years ago · Last pushed almost 3 years ago

Metadata Files

Readme License Citation

Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition (ECCV 2022 Oral)

The official code of CornerTransformer (ECCV 2022, Oral).

This work focuses on a new and challenging task: artistic text recognition. To tackle its difficulties, we introduce the corner point map as a robust representation of the artistic text image and present a corner-query cross-attention mechanism that helps the model attend more accurately. We also design a character contrastive loss to learn invariant character features, which leads to tight clustering of features. To benchmark the performance of different models, we provide the WordArt dataset.
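The character contrastive loss pulls features of the same character together and pushes features of different characters apart. The following is a minimal sketch of that idea, assuming a supervised contrastive (SupCon-style) formulation over L2-normalised feature vectors. The function name `character_contrastive_loss`, the temperature value, and the pure-Python implementation are illustrative assumptions and are not taken from this repo, whose actual loss may differ.

```python
import math


def character_contrastive_loss(features, labels, temperature=0.1):
    """SupCon-style loss over character features (illustrative assumption).

    features: list of equal-length float vectors, one per character instance.
    labels:   list of character classes (e.g. "a", "b"), same length.
    Anchors without any other instance of their class are skipped.
    """
    def normalise(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    z = [normalise(v) for v in features]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity of anchor i to every other sample.
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n) if j != i}
        # Numerically stable log-sum-exp over all non-anchor samples.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Average negative log-probability of the positives for this anchor.
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


# Toy 2-D features: same-class features close together vs. far apart.
tight = character_contrastive_loss(
    [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]], ["a", "a", "b", "b"])
mixed = character_contrastive_loss(
    [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]], ["a", "a", "b", "b"])
print(tight < mixed)
```

In this toy setup the loss is lower when instances of the same character already lie close together, which is the clustering behaviour the loss is meant to encourage.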

Runtime Environment

This repo depends on PyTorch, MMCV, MMDetection, and MMOCR. The quick installation steps are listed below. Please refer to the MMOCR 0.6 Install Guide for more detailed instructions.

```shell
conda create -n wordart python=3.7 -y
conda activate wordart
conda install pytorch==1.10 torchvision cudatoolkit=11.3 -c pytorch
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10.0/index.html
pip install mmdet
git clone https://github.com/xdxie/WordArt.git
cd WordArt
pip install -r requirements.txt
pip install -v -e .
export PYTHONPATH=$(pwd):$PYTHONPATH
pip install -r requirements/albu.txt
```

WordArt Dataset

The WordArt dataset consists of 6316 artistic text images: 4805 for training and 1511 for testing. The dataset is available on Google Drive.

Preparing Datasets

Please follow the steps in the MMOCR 0.6 Dataset Zoo to prepare the text recognition datasets, and put all of them in the data/mixture folder. In this repository, we train the model on two synthetic datasets, MJSynth and SynthText, and evaluate it on IIIT5k, IC13, SVT, IC15, SVTP, CUTE, and our proposed WordArt.

Note: Please make sure to reprocess the two training datasets by following those steps.

Training

For distributed training on multiple GPUs, please use

```shell
./tools/dist_train.sh ${CONFIG_FILE} ${WORK_DIR} ${GPU_NUM} [PY_ARGS]
```

For training on a single GPU, please use

```shell
python tools/train.py ${CONFIG_FILE} [ARGS]
```

For example, we use this script to train the model:

```shell
./tools/dist_train.sh configs/textrecog/corner_transformer/corner_transformer_academic.py outputs/corner_transformer/ 4
```

Evaluation

For distributed evaluation on multiple GPUs, please use

```shell
./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [PY_ARGS]
```

For evaluation on a single GPU, please use

```shell
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [ARGS]
```

For example, we use this script to evaluate the model performance:

```shell
CUDA_VISIBLE_DEVICES=0 python tools/test.py outputs/corner_transformer/corner_transformer_academic.py outputs/corner_transformer/latest.pth --eval acc
```
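The `--eval acc` option asks the test script for recognition accuracy. As a rough illustration of what a word-level accuracy measures, here is a minimal sketch that counts a prediction as correct only if the whole word matches the ground truth. The function name `word_accuracy` and the normalisation rules (lowercasing, keeping only ASCII letters and digits) are assumptions made for this example; the metric implemented in MMOCR may normalise text differently and reports additional scores.

```python
import re


def word_accuracy(predictions, ground_truths, ignore_case=True, ignore_symbols=True):
    """Fraction of predictions that exactly match their ground-truth word.

    The normalisation below is an assumption for illustration only.
    """
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have the same length")
    if not ground_truths:
        return 0.0

    def normalise(text):
        if ignore_symbols:
            # Keep only ASCII letters and digits.
            text = re.sub(r"[^A-Za-z0-9]", "", text)
        if ignore_case:
            text = text.lower()
        return text

    correct = sum(normalise(p) == normalise(g)
                  for p, g in zip(predictions, ground_truths))
    return correct / len(ground_truths)


# Hypothetical predictions and labels, not outputs of the released model.
preds = ["Hello", "w0rld!", "art"]
labels = ["hello", "world", "ART"]
print(word_accuracy(preds, labels))
print(word_accuracy(preds, labels, ignore_case=False))
```

Because a single wrong character makes the whole word count as an error, this kind of metric is strict on long or heavily stylised words.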

Results

|Method|IC13|SVT|IIIT|IC15|SVTP|CUTE|WordArt|download|
|-|-|-|-|-|-|-|-|-|
|CornerTransformer|96.4|94.6|95.9|86.3|91.5|92.0|70.8|model|
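To put the WordArt number in context, the short calculation below compares it with the unweighted mean over the six common benchmarks in the table. The unweighted mean is an assumption chosen for simplicity; it ignores the different test-set sizes, so it is not the weighted average that papers often report.

```python
# Accuracies copied from the results table above.
results = {
    "IC13": 96.4, "SVT": 94.6, "IIIT": 95.9,
    "IC15": 86.3, "SVTP": 91.5, "CUTE": 92.0,
    "WordArt": 70.8,
}

# Unweighted mean over the six common benchmarks (WordArt excluded).
common = [acc for name, acc in results.items() if name != "WordArt"]
mean_common = sum(common) / len(common)

# Gap between the common benchmarks and the artistic-text benchmark.
gap = mean_common - results["WordArt"]
print(f"mean over common benchmarks: {mean_common:.1f}")
print(f"gap to WordArt: {gap:.1f}")
```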

Visualization

Each example is shown together with the results from ABINet-LV, our baseline, and the proposed CornerTransformer. CornerTransformer successfully recognizes the hard examples.

CornerTransformer may fail when decorative patterns in the background have exactly the same appearance as the text and a similar shape. Each failure case is shown with our result and the ground truth.

Citation

Please cite the following paper when using the WordArt dataset or this repo.

```
@inproceedings{xie2022toward,
  title={Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition},
  author={Xie, Xudong and Fu, Ling and Zhang, Zhifei and Wang, Zhaowen and Bai, Xiang},
  booktitle={ECCV},
  year={2022}
}
```

Acknowledgement

This repo is based on MMOCR 0.6. We appreciate this wonderful open-source toolbox.

Owner

Name: Xudong Xie
Login: xdxie
Kind: user
Location: Wuhan, China
Company: Huazhong University of Science and Technology

Repositories: 2
Profile: https://github.com/xdxie

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "OpenMMLab Text Detection, Recognition and Understanding Toolbox"
authors:
  - name: "MMOCR Contributors"
version: 0.3.0
date-released: 2020-08-15
repository-code: "https://github.com/open-mmlab/mmocr"
license: Apache-2.0

GitHub Events

Total

Watch event: 7
Issue comment event: 1
Fork event: 1

Last Year

Watch event: 7
Issue comment event: 1
Fork event: 1

Dependencies

docs/en/requirements.txt pypi

recommonmark *
sphinx *
sphinx_markdown_tables *
sphinx_rtd_theme *

mmocr.egg-info/requires.txt pypi

asynctest *
codecov *
flake8 *
imgaug *
isort *
kwarray *
lanms-neo ==1.0.2
lmdb *
matplotlib *
mmcv-full >=1.3.8
mmdet >=2.21.0
numpy *
opencv-python *
pyclipper *
pycocotools *
pytest *
pytest-cov *
pytest-runner *
rapidfuzz >=2.0.0
scikit-image *
torch >=1.1
ubelt *
xdoctest >=0.10.0
yapf *

requirements/albu.txt pypi

albumentations >=1.1.0

requirements/build.txt pypi

numpy *
pyclipper *
torch >=1.1

requirements/docs.txt pypi

docutils ==0.16.0
markdown <3.4.0
myst-parser *
sphinx ==4.0.2
sphinx_copybutton *
sphinx_markdown_tables *

requirements/mminstall.txt pypi

mmcv-full >=1.3.8
mmdet >=2.21.0

requirements/readthedocs.txt pypi

imgaug *
kwarray *
lanms-neo ==1.0.2
lmdb *
matplotlib *
mmcv *
mmdet *
pyclipper *
rapidfuzz >=2.0.0
regex *
scikit-image *
scipy *
shapely *
titlecase *
torch *
torchvision *

requirements/runtime.txt pypi

imgaug *
lanms-neo ==1.0.2
lmdb *
matplotlib *
numpy *
opencv-python >=4.2.0.32,
pyclipper *
pycocotools *
rapidfuzz >=2.0.0
scikit-image *

requirements/tests.txt pypi

asynctest * test
codecov * test
flake8 * test
isort * test
kwarray * test
pytest * test
pytest-cov * test
pytest-runner * test
ubelt * test
xdoctest >=0.10.0 test
yapf * test

.circleci/docker/Dockerfile docker

pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build

docker/Dockerfile docker

pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build

docker/serve/Dockerfile docker

pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build

requirements/optional.txt pypi

requirements.txt pypi

setup.py pypi
