Recent Releases of mmocr

mmocr - MMOCR Release v1.0.1

We are thrilled to announce the release of MMOCR v1.0.1! This version contains important bug fixes and feature enhancements.

🆕 New Features

Scheduler visualization from mmpretrain is now added to MMOCR thanks to @A-new-b #1866
AWS S3 obtainer support by @EnableAsync in https://github.com/open-mmlab/mmocr/pull/1888

🛠️ Bug Fixes

Fixed TypeError bug by @frankstorming #1868
Updated IIIT5K md5 by @gaotongxiao #1848
Fixed some Chinese display problems by @KevinNuNu #1922
Updated branches in the continuous integration (CI) setup by @gaotongxiao #1842

📝 Documentation Improvements

Removed version tab from the documentation by @gaotongxiao #1843
Updated data preparation guide by @Harold-lkk #1784
Updated English version of dataset_preparer.md by @Lum1104 #1860

New Contributors

@frankstorming made their first contribution in https://github.com/open-mmlab/mmocr/pull/1868
@A-new-b made their first contribution in https://github.com/open-mmlab/mmocr/pull/1866
@Lum1104 made their first contribution in https://github.com/open-mmlab/mmocr/pull/1860
@ly015 made their first contribution in https://github.com/open-mmlab/mmocr/pull/1944

Full Changelog: https://github.com/open-mmlab/mmocr/compare/v1.0.0...v1.0.1

- Python
Published by gaotongxiao almost 3 years ago

mmocr - MMOCR Release v1.0.0

We are excited to announce the first official release of MMOCR 1.0, with many enhancements, bug fixes, and the introduction of new dataset support!

🌟 Highlights

Support for SCUT-CTW1500, SynthText, and MJSynth datasets
Updated FAQ and documentation
Deprecation of file_client_args in favor of backend_args
Added a new MMOCR tutorial notebook
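The rename from file_client_args to backend_args mostly affects user configs. The following is a minimal sketch of how an old config fragment could be rewritten. The helper `migrate_backend_args` is hypothetical (it is not part of MMOCR or MMEngine), and the sketch assumes that the dictionary contents can be carried over unchanged, which should be checked against the MMEngine documentation for the backend in use.

```python
def migrate_backend_args(cfg):
    """Return a copy of ``cfg`` with ``file_client_args`` renamed.

    Hypothetical helper for illustration only; it walks nested dicts,
    lists and tuples, which is how pipeline configs are usually written.
    """
    if isinstance(cfg, dict):
        migrated = {}
        for key, value in cfg.items():
            if key == "file_client_args":
                # Keep an explicit backend_args if the config already has one.
                if "backend_args" in cfg:
                    continue
                key = "backend_args"
            migrated[key] = migrate_backend_args(value)
        return migrated
    if isinstance(cfg, (list, tuple)):
        return type(cfg)(migrate_backend_args(item) for item in cfg)
    return cfg


# An old-style pipeline fragment (the values are placeholders).
old_pipeline = [
    dict(type="LoadImageFromFile", file_client_args=dict(backend="disk")),
    dict(type="LoadOCRAnnotations", with_bbox=True),
]
new_pipeline = migrate_backend_args(old_pipeline)
print(new_pipeline[0])
```

The helper returns a new structure and leaves the original config untouched, so both versions can be compared before the old one is dropped.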

🆕 New Features & Enhancements

Add SCUT-CTW1500 by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1677
Cherry Pick #1205 by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1774
Make lanms-neo optional by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1772
SynthText by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1779
Deprecate file_client_args and use backend_args instead by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1765
MJSynth by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1791
Add MMOCR tutorial notebook by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1771
Decouple batch_size into det_batch_size, rec_batch_size and kie_batch_size in MMOCRInferencer by @hugotong6425 in https://github.com/open-mmlab/mmocr/pull/1801
Accepts local-rank in train.py and test.py by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1806
Update stitch_boxes_into_lines by @cherryjm in https://github.com/open-mmlab/mmocr/pull/1824
Add tests for pytorch 2.0 by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1836

📝 Docs

FAQ by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1773
Remove LoadImageFromLMDB from docs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1767
Mark projects in docs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1766
add opendatalab download link by @jorie-peng in https://github.com/open-mmlab/mmocr/pull/1753
Fix some deadlinks in the docs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1469
Fix quick run by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1775
Dataset by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1782
Update faq by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1817
more social network links by @fengshiwest in https://github.com/open-mmlab/mmocr/pull/1818
Update docs after branch switching by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1834

🛠️ Bug Fixes:

Place dicts to .mim by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1781
Test svtr_small instead of svtr_tiny by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1786
Add pse weight to metafile by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1787
Synthtext metafile by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1788
Clear up some unused scripts by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1798
Fix a file-not-found error that may be raised when moving a single file to a dst that does not exist by @KevinNuNu in https://github.com/open-mmlab/mmocr/pull/1803
CTW1500 by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1814
MJSynth & SynthText Dataset Preparer config by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1805
Use poly_intersection instead of poly.intersection to avoid sup… by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1811
Abinet: fix ValueError: Blur limit must be odd when centered=True. Got: (3, 6) by @hugotong6425 in https://github.com/open-mmlab/mmocr/pull/1821
Bug generated during kie inference visualization by @Yangget in https://github.com/open-mmlab/mmocr/pull/1830
Revert sync bn in inferencer by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1832
Fix mmdet digit version by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1840
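One of the fixes in this list concerns moving a single file to a destination that does not exist yet. The sketch below shows the general pattern for avoiding that error with the standard library. It is not the code from the PR: the function name `move_single_file` is made up for illustration, and it assumes that dst is a full file path rather than a directory.

```python
import shutil
import tempfile
from pathlib import Path


def move_single_file(src, dst):
    """Move ``src`` to ``dst``, creating missing parent directories first."""
    src, dst = Path(src), Path(dst)
    # Without this step the move fails when the target directory is missing.
    dst.parent.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(src), str(dst)))


# Small demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "label.txt"
    source.write_text("hello")
    target = Path(tmp) / "annotations" / "train" / "label.txt"
    moved = move_single_file(source, target)
    moved_ok = moved.exists() and not source.exists()
    print(moved_ok)
```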

🎉 New Contributors

@jorie-peng made their first contribution in https://github.com/open-mmlab/mmocr/pull/1753
@hugotong6425 made their first contribution in https://github.com/open-mmlab/mmocr/pull/1801
@fengshiwest made their first contribution in https://github.com/open-mmlab/mmocr/pull/1818
@cherryjm made their first contribution in https://github.com/open-mmlab/mmocr/pull/1824
@Yangget made their first contribution in https://github.com/open-mmlab/mmocr/pull/1830

Thank you to all the contributors for making this release possible! We're excited about the new features and enhancements in this version, and we're looking forward to your feedback and continued support. Happy coding! 🚀

Full Changelog: https://github.com/open-mmlab/mmocr/compare/v1.0.0rc6...v1.0.0

- Python
Published by gaotongxiao about 3 years ago

mmocr - MMOCR Release v1.0.0rc6

Highlights

Two new models, ABCNet v2 (inference only) and SPTS, have been added to the projects/ folder.
Announcing Inferencer, a unified inference interface in OpenMMLab for everyone's easy access and quick inference with all the pre-trained weights. Docs
Users can use test-time augmentation for text recognition tasks. Docs
Support batch augmentation through BatchAugSampler, which is a technique used in SPTS.
Dataset Preparer has been refactored to allow more flexible configurations. Besides, users are now able to prepare text recognition datasets in LMDB formats. Docs
Some text spotting datasets have been revised to improve their correctness and consistency with common practice.
Potential spurious warnings from shapely have been eliminated.

Dependency

This version requires MMEngine >= 0.6.0, MMCV >= 2.0.0rc4 and MMDet >= 3.0.0rc5.
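As a rough illustration of the dependency note above, the following sketch compares a set of version strings against the stated minimums. The comparison logic is deliberately simplified and only understands versions of the form X.Y.Z or X.Y.ZrcN. The names `version_key` and `unmet_requirements` are hypothetical, and a real check would more likely rely on an established version-parsing library.

```python
import re

# Minimum versions quoted in the dependency note.
MINIMUM_VERSIONS = {"mmengine": "0.6.0", "mmcv": "2.0.0rc4", "mmdet": "3.0.0rc5"}

_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:rc(\d+))?")


def version_key(text):
    """Turn 'X.Y.Z' or 'X.Y.ZrcN' into a tuple that sorts correctly."""
    match = _VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported version string: {text!r}")
    major, minor, patch, rc = match.groups()
    # A final release has no rc tag and must rank above every rcN.
    rc_rank = float("inf") if rc is None else int(rc)
    return (int(major), int(minor), int(patch), rc_rank)


def unmet_requirements(installed):
    """List the packages in ``installed`` that miss the minimum versions."""
    problems = []
    for name, minimum in MINIMUM_VERSIONS.items():
        found = installed.get(name)
        if found is None:
            problems.append(f"{name} is not installed (need >= {minimum})")
        elif version_key(found) < version_key(minimum):
            problems.append(f"{name} {found} is older than {minimum}")
    return problems


# Made-up example environment, not a real installation.
example_env = {"mmengine": "0.7.0", "mmcv": "2.0.0rc3", "mmdet": "3.0.0"}
problems = unmet_requirements(example_env)
print(problems)
```

In a real environment the `installed` mapping could be filled with `importlib.metadata.version(name)` for each package name.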

New Features & Enhancements

Discard deprecated lmdb dataset format and only support img+label now by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1681
abcnetv2 inference by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1657
Add RepeatAugSampler by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1678
SPTS by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1696
Refactor Inferencers by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1608
Dynamic return type for rescale_polygons by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1702
Revise upstream version limit by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1703
TextRecogCropConverter: add cropping with the OpenCV warpPerspective function by @KevinNuNu in https://github.com/open-mmlab/mmocr/pull/1667
change cudnn benchmark to false by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1705
Add ST-pretrained DB-series models and logs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1635
Only keep meta and state_dict when publish model by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1729
Rec TTA by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1401
Speedup formatting by replacing np.transpose with torch… by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1719
Support auto import modules from registry. by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1731
Support batch visualization & dumping in Inferencer by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1722
add a new argument font_properties to set a specific font file in order to draw Chinese characters properly by @KevinNuNu in https://github.com/open-mmlab/mmocr/pull/1709
Refactor data converter and gather by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1707
Support batch augmentation through BatchAugSampler by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1757
Put all registry into registry.py by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1760
train by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1756
configs for regression benchmark by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1755
Support lmdb format in Dataset Preparer by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1762

Docs

update the link of DBNet by @AllentDan in https://github.com/open-mmlab/mmocr/pull/1672
Add notice for default branch switching by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1693
docs: Add twitter discord medium youtube link by @vansin in https://github.com/open-mmlab/mmocr/pull/1724
Remove unsupported datasets in docs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1670

Bug Fixes

Update dockerfile by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1671
Explicitly create np object array for compatibility by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1691
Fix a minor error in docstring by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1685
Fix lint by @triple-Mu in https://github.com/open-mmlab/mmocr/pull/1694
Fix LoadOCRAnnotation ut by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1695
Fix isort pre-commit error by @KevinNuNu in https://github.com/open-mmlab/mmocr/pull/1697
Update owners by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1699
Detect intersection before using shapely.intersection to eliminate spurious warnings by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1710
Fix some inferencer bugs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1706
Fix textocr ignore flag by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1712
Add missing softmax in ASTER forward_test by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1718
Fix head in readme by @vansin in https://github.com/open-mmlab/mmocr/pull/1727
Fix some browse dataset script bugs and draw textdet gt instance with ignore flags by @KevinNuNu in https://github.com/open-mmlab/mmocr/pull/1701
icdar textrecog ann parser skip data with ignore flag by @KevinNuNu in https://github.com/open-mmlab/mmocr/pull/1708
bezier_to_polygon -> bezier2polygon by @double22a in https://github.com/open-mmlab/mmocr/pull/1739
Fix docs recog CharMetric P/R error definition by @KevinNuNu in https://github.com/open-mmlab/mmocr/pull/1740
Remove outdated resources in demo/ by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1747
Fix wrong ic13 textspotting split data; add lexicons to ic13, ic15 and totaltext by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1758
SPTS readme by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1761

New Contributors

@triple-Mu made their first contribution in https://github.com/open-mmlab/mmocr/pull/1694
@double22a made their first contribution in https://github.com/open-mmlab/mmocr/pull/1739

Full Changelog: https://github.com/open-mmlab/mmocr/compare/v1.0.0rc5...v1.0.0rc6

- Python
Published by gaotongxiao about 3 years ago

mmocr - MMOCR Release v1.0.0rc5

Highlights

Two models, Aster and SVTR, are added to our model zoo. The full implementation of ABCNet is also available now.
Dataset Preparer supports 5 more datasets: CocoTextV2, FUNSD, TextOCR, NAF, SROIE.
We have 4 more text recognition transforms, and two helper transforms. See https://github.com/open-mmlab/mmocr/pull/1646 https://github.com/open-mmlab/mmocr/pull/1632 https://github.com/open-mmlab/mmocr/pull/1645 for details.
The FixInvalidPolygon transform is getting smarter at dealing with invalid polygons and can now handle more irregular annotations. As a result, a complete training cycle on the TotalText dataset can be performed bug-free. The weights of DBNet and FCENet pretrained on TotalText are also released.

New Features & Enhancements

Update ic15 det config according to DataPrepare by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1617
Refactor icdardataset metainfo to lowercase. by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1620
Add ASTER Encoder by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1239
Add ASTER decoder by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1625
Add ASTER config by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1238
Update ASTER config by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1629
Support browse_dataset.py to visualize original dataset by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1503
Add CocoTextv2 to dataset preparer by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1514
Add Funsd to dataset preparer by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1550
Add TextOCR to Dataset Preparer by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1543
Refine example projects and readme by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1628
Enhance FixInvalidPolygon, add RemoveIgnored transform by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1632
ConditionApply by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1646
Add NAF to dataset preparer by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1609
Add SROIE to dataset preparer by @FerryHuang in https://github.com/open-mmlab/mmocr/pull/1639
Add svtr decoder by @willpat1213 in https://github.com/open-mmlab/mmocr/pull/1448
Add missing unit tests by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1651
Add svtr encoder by @willpat1213 in https://github.com/open-mmlab/mmocr/pull/1483
ABCNet train by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1610
Totaltext cfgs for DB and FCE by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1633
Add Aliases to models by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1611
SVTR transforms by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1645
Add SVTR framework and configs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1621
Issue Template by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1663

Docs

Add Chinese translation for browse_dataset.py by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1647
Update abcnet doc by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1658
Update the dbnetpp's readme file by @zhuyue66 in https://github.com/open-mmlab/mmocr/pull/1626

Bug Fixes

nn.SmoothL1Loss beta can not be zero in PyTorch 1.13 version by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1616
ctc loss bug if target is empty by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1618
Add torch 1.13 by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1619
Remove outdated tutorial link by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1627
Dev 1.x some doc mistakes by @KevinNuNu in https://github.com/open-mmlab/mmocr/pull/1630
Support custom font to visualize some languages (e.g. Korean) by @ProtossDragoon in https://github.com/open-mmlab/mmocr/pull/1567
db_module_loss, negative number encountered in sqrt by @KevinNuNu in https://github.com/open-mmlab/mmocr/pull/1640
Use int instead of np.int by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1636
Remove support for py3.6 by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1660

New Contributors

@zhuyue66 made their first contribution in https://github.com/open-mmlab/mmocr/pull/1626
@KevinNuNu made their first contribution in https://github.com/open-mmlab/mmocr/pull/1630
@FerryHuang made their first contribution in https://github.com/open-mmlab/mmocr/pull/1639
@willpat1213 made their first contribution in https://github.com/open-mmlab/mmocr/pull/1448

Full Changelog: https://github.com/open-mmlab/mmocr/compare/v1.0.0rc4...v1.0.0rc5

- Python
Published by gaotongxiao over 3 years ago

mmocr - MMOCR Release v1.0.0rc4

Highlights

Dataset Preparer can automatically generate base dataset configs at the end of the preparation process, and supports 6 more datasets: IIIT5k, CUTE80, ICDAR2013, ICDAR2015, SVT, SVTP.
Introducing our projects/ folder. Implementing new models and features in OpenMMLab's algorithm libraries has long been criticized as troublesome because of the rigorous code-quality requirements, which can hinder fast iteration on SOTA models and may discourage community members from sharing their latest results here. The new projects/ folder holds experimental features, frameworks and models that only need to satisfy a minimum code-quality requirement. Everyone is welcome to post their implementation of any great idea in this folder! We have also added the first example project to illustrate what we expect a good project to contain (check out the raw content of README.md for more info!).
Inside the projects/ folder, we are releasing the preview version of ABCNet, which is the first implementation of text spotting models in MMOCR. It's inference-only now, but the full implementation will be available very soon.

New Features & Enhancements

Add SVT to dataset preparer by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1521
Polish bbox2poly by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1532
Add SVTP to dataset preparer by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1523
Iiit5k converter by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1530
Add cute80 to dataset preparer by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1522
Add IC13 preparer by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1531
Add 'Projects/' folder, and the first example project by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1524
Rename to {dataset-name}_task_train/test by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1541
Add print_config.py to the tools by @IncludeMathH in https://github.com/open-mmlab/mmocr/pull/1547
Add get_md5 by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1553
Add config generator by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1552
Support IC15_1811 by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1556
Update CT80 config by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1555
Add config generators to all textdet and textrecog configs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1560
Refactor TPS by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1240
Add TextSpottingConfigGenerator by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1561
Add common typing by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1596
Update textrecog config and readme by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1597
Support head loss or postprocessor is None for only infer by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1594
Textspotting datasample by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1593
Simplify mono_gather by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1588
ABCNet v1 infer by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1598

Docs

Add Chinese Guidance on How to Add New Datasets to Dataset Preparer by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1506
Update the qq group link by @vansin in https://github.com/open-mmlab/mmocr/pull/1569
Collapse some sections; update logo url by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1571
Update dataset preparer (CN) by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1591

Bug Fixes

Fix two bugs in dataset preparer by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1513
Register bug of CLIPResNet by @jyshee in https://github.com/open-mmlab/mmocr/pull/1517
Being more conservative on Dataset Preparer by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1520
python -m pip upgrade in windows by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1525
Fix wildreceipt metafile by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1528
Fix Dataset Preparer Extract by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1527
Fix ICDARTxtParser by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1529
Fix Dataset Zoo Script by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1533
Fix crop without padding and recog metainfo delete unuse info by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1526
Automatically create nonexistent directory for base configs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1535
Change mmcv.dump to mmengine.dump by @ProtossDragoon in https://github.com/open-mmlab/mmocr/pull/1540
mmocr.utils.typing -> mmocr.utils.typing_utils by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1538
Wildreceipt tests by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1546
Fix judge exist dir by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1542
Fix IC13 textdet config by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1563
Fix IC13 textrecog annotations by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1568
Auto scale lr by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1584
Fix icdar data parse for text containing separator by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1587
Fix textspotting ut by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1599
Fix TextSpottingConfigGenerator and TextSpottingDataConverter by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1604
Keep E2E Inferencer output simple by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1559

New Contributors

@jyshee made their first contribution in https://github.com/open-mmlab/mmocr/pull/1517
@ProtossDragoon made their first contribution in https://github.com/open-mmlab/mmocr/pull/1540
@IncludeMathH made their first contribution in https://github.com/open-mmlab/mmocr/pull/1547

Full Changelog: https://github.com/open-mmlab/mmocr/compare/v1.0.0rc3...v1.0.0rc4

- Python
Published by gaotongxiao over 3 years ago

mmocr - MMOCR Release v0.6.3

Highlights

This release enhances the inference script and fixes a bug that might cause failure on TorchServe.

Besides, a new backbone, oCLIP-ResNet, and a dataset preparation tool, Dataset Preparer, have been released in MMOCR 1.0.0rc3 (1.x branch). Check out the changelog for more information about the features, and maintenance plan for how we will maintain MMOCR in the future.

New Features & Enhancements

Convert numpy.float32 type to python built-in float type by @JunYao1020 in https://github.com/open-mmlab/mmocr/pull/1462
When '.' char not in output string, output is also considered to be a… by @JunYao1020 in https://github.com/open-mmlab/mmocr/pull/1457
Refactor issue template by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1449
issue template by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1489
Update maintainers by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1504
Support MMCV < 1.8.0 by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1508

Bug Fixes

fix ci by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1491
[CI] Fix CI by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1463

Docs

[DOCs] Add MMYOLO in Readme. by @ysh329 in https://github.com/open-mmlab/mmocr/pull/1475
[Docs] Update contributing.md by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1490

New Contributors

@ysh329 made their first contribution in https://github.com/open-mmlab/mmocr/pull/1475

Full Changelog: https://github.com/open-mmlab/mmocr/compare/v0.6.2...v0.6.3

- Python
Published by gaotongxiao over 3 years ago

mmocr - MMOCR Release v1.0.0rc3

Highlights

We release several pretrained models using oCLIP-ResNet as the backbone, which is a ResNet variant trained with oCLIP and can significantly boost the performance of text detection models.
Preparing datasets is troublesome and tedious, especially in OCR domain where multiple datasets are usually required. In order to free our users from laborious work, we designed a Dataset Preparer to help you get a bunch of datasets ready for use, with only one line of command! Dataset Preparer is also crafted to consist of a series of reusable modules, each responsible for handling one of the standardized phases throughout the preparation process, shortening the development cycle on supporting new datasets.

New Features & Enhancements

Add Dataset Preparer by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1484
support modified resnet structure used in oCLIP by @HannibalAPE in https://github.com/open-mmlab/mmocr/pull/1458
Add oCLIP configs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1509

Docs

Update install.md by @rogachevai in https://github.com/open-mmlab/mmocr/pull/1494
Refine some docs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1455
Update some dataset preparer related docs by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1502
oclip readme by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1505

Bug Fixes

Fix offline_eval error caused by new data flow by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1500

New Contributors

@rogachevai made their first contribution in https://github.com/open-mmlab/mmocr/pull/1494
@HannibalAPE made their first contribution in https://github.com/open-mmlab/mmocr/pull/1458

Full Changelog: https://github.com/open-mmlab/mmocr/compare/v1.0.0rc2...v1.0.0rc3

- Python
Published by gaotongxiao over 3 years ago

mmocr - MMOCR Release v1.0.0rc2

This release relaxes the version requirement of MMEngine to >=0.1.0, < 1.0.0.

Full Changelog: https://github.com/open-mmlab/mmocr/compare/v1.0.0rc1...v1.0.0rc2

- Python
Published by gaotongxiao over 3 years ago

mmocr - MMOCR Release v0.6.2

Highlights

It's now possible to train/test models through a Python interface. For example, you can train a model under the mmocr/ directory in this way:

```python
# An example of how to use such modifications is shown as the following:
from mmocr.tools.train import TrainArg, parse_args, run_train_cmd

args = TrainArg(config='/path/to/config.py')
args.add_arg('--work-dir', '/path/to/dir')
args = parse_args(args.arg_list)
run_train_cmd(args)
```

See PR #1138 for more details.

Besides, release candidates for MMOCR 1.0 with tons of new features are available at 1.x branch now! Check out the changelog for more information about the features, and maintenance plan for how we will maintain MMOCR in the future.

New Features

Adding test & train API to be used directly in code by @wybryan in https://github.com/open-mmlab/mmocr/pull/1138
Let ResizeOCR full support mmcv.impad's pad_val parameters by @hsiehpinghan in https://github.com/open-mmlab/mmocr/pull/1437

Bug Fixes

Fix ABINet config by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1256
Fix Recognition Score Normalization Issue by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1333
Remove max_seq_len inconsistency by @antoniolanza1996 in https://github.com/open-mmlab/mmocr/pull/1433
box points ordering by @yjmm10 in https://github.com/open-mmlab/mmocr/pull/1205
Correct the misspelling 'preperties' to 'properties' by @JunYao1020 in https://github.com/open-mmlab/mmocr/pull/1446

Docs

Demo, experiments and live inference API on Tiyaro by @Venkat2811 in https://github.com/open-mmlab/mmocr/pull/1272
Update 1.x info by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1369
Add global notes to the docs and the version switcher menu by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1406
Logger Hook Config Updated to Add WandB by @Nourollah in https://github.com/open-mmlab/mmocr/pull/1345

New Contributors

@Venkat2811 made their first contribution in https://github.com/open-mmlab/mmocr/pull/1272
@wybryan made their first contribution in https://github.com/open-mmlab/mmocr/pull/1139
@hsiehpinghan made their first contribution in https://github.com/open-mmlab/mmocr/pull/1437
@yjmm10 made their first contribution in https://github.com/open-mmlab/mmocr/pull/1205
@JunYao1020 made their first contribution in https://github.com/open-mmlab/mmocr/pull/1446
@Nourollah made their first contribution in https://github.com/open-mmlab/mmocr/pull/1345

Full Changelog: https://github.com/open-mmlab/mmocr/compare/v0.6.1...v0.6.2

- Python
Published by gaotongxiao over 3 years ago

mmocr - MMOCR Release v1.0.0rc1

Highlights

This release fixes a severe bug causing inaccurate metric reports in multi-GPU training. Together with the fix, weights for all the text recognition models in the MMOCR 1.0 architecture are released. The inference shorthands for them have also been added back to ocr.py. Besides, more documentation chapters are available now.

New Features & Enhancements

Simplify the Mask R-CNN config by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1391
auto scale lr by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1326
Update paths to pretrain weights by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1416
Streamline duplicated split_result in pan_postprocessor by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1418
Update model links in ocr.py and inference.md by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1431
Update rec configs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1417
Visualizer refine by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1411
Support get flops and parameters in dev-1.x by @vansin in https://github.com/open-mmlab/mmocr/pull/1414

Docs

intersphinx and api by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1367
Fix quickrun by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1374
Fix some docs issues by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1385
Add Documents for DataElements by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1381
config english by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1372
Metrics by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1399
Add version switcher to menu by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1407
Data Transforms by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1392
Fix inference docs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1415
Fix some docs by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1410
Add maintenance plan to migration guide by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1413
Update Recog Models by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1402

Bug Fixes

clear metric.results only done in main process by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1379
Fix a bug in MMDetWrapper by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1393
Fix browse_dataset.py by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1398
ImgAugWrapper: Do not clip polygons if not applicable by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1231
Fix CI by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1365
Fix merge stage test by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1370
Del CI support for torch 1.5.1 by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1371
Test windows cu111 by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1373
Fix windows CI by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1387
Upgrade pre commit hooks by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1429
Skip invalid augmented polygons in ImgAugWrapper by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1434

New Contributors

@vansin made their first contribution in https://github.com/open-mmlab/mmocr/pull/1414

Full Changelog: https://github.com/open-mmlab/mmocr/compare/v1.0.0rc0...v1.0.0rc1

- Python
Published by gaotongxiao over 3 years ago

mmocr - MMOCR Release v1.0.0rc0

We are excited to announce the release of MMOCR 1.0.0rc0! MMOCR 1.0.0rc0 is the first version of MMOCR 1.x, a part of the OpenMMLab 2.0 projects. Built upon the new training engine, MMOCR 1.x unifies the interfaces of dataset, models, evaluation, and visualization with faster training and testing speed.

Highlights

New engines. MMOCR 1.x is based on MMEngine, which provides a general and powerful runner that allows more flexible customizations and significantly simplifies the entrypoints of high-level interfaces.
Unified interfaces. As a part of the OpenMMLab 2.0 projects, MMOCR 1.x unifies and refactors the interfaces and internal logic of training, testing, datasets, models, evaluation, and visualization. All the OpenMMLab 2.0 projects share the same design in those interfaces and logic to allow the emergence of multi-task/modality algorithms.
Cross project calling. Benefiting from the unified design, you can use the models implemented in other OpenMMLab projects, such as MMDet. We provide an example of how to use MMDetection's Mask R-CNN through MMDetWrapper. Check our documents for more details. More wrappers will be released in the future.
Stronger visualization. We provide a series of useful tools which are mostly based on brand-new visualizers. As a result, it is more convenient for the users to explore the models and datasets now.
More documentation and tutorials. We have added a bunch of documentation and tutorials to help users get started more smoothly. Read them in the MMOCR documentation.

Breaking Changes

We briefly list the major breaking changes here. We also have the migration guide that provides complete details and migration instructions.

Dependencies

MMOCR 1.x relies on MMEngine to run. MMEngine is a new foundational library for training deep learning models in OpenMMLab 2.0. The dependencies of file IO and training are migrated from MMCV 1.x to MMEngine.
MMOCR 1.x relies on MMCV>=2.0.0rc0. Although MMCV no longer maintains the training functionalities since 2.0.0rc0, MMOCR 1.x relies on the data transforms, CUDA operators, and image processing interfaces in MMCV. Note that since MMCV 2.0.0rc0, the package mmcv is the version that provides pre-built CUDA operators, mmcv-lite does not provide them, and mmcv-full has been deprecated.

Training and testing

MMOCR 1.x uses Runner in MMEngine rather than that in MMCV. The new Runner implements and unifies the building logic of dataset, model, evaluation, and visualizer. Therefore, MMOCR 1.x no longer maintains the building logic of those modules in mmocr.train.apis and tools/train.py. That code has been migrated into MMEngine. Please refer to the migration guide of Runner in MMEngine for more details.
The Runner in MMEngine also supports testing and validation. The testing scripts are also simplified and build the runner with logic similar to that of the training scripts.
The execution points of hooks in the new Runner have been enriched to allow more flexible customization. Please refer to the migration guide of Hook in MMEngine for more details.
Learning rate and momentum schedules have been migrated from Hook to Parameter Scheduler in MMEngine. Please refer to the migration guide of Parameter Scheduler in MMEngine for more details.
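As a rough illustration of the scheduler migration described above, the same step schedule with linear warmup might be written in both styles as follows. This is a sketch under assumptions: the lr_config keys follow the MMCV 1.x hook convention, LinearLR and MultiStepLR are scheduler types from MMEngine, and every number is illustrative rather than taken from an MMOCR config.

```python
# MMOCR 0.x style: the schedule is configured through an MMCV 1.x hook.
# All values are illustrative.
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[8, 11])

# MMOCR 1.x style: the same schedule expressed as MMEngine parameter schedulers.
param_scheduler = [
    # Linear warmup over the first 500 iterations.
    dict(type='LinearLR', start_factor=0.001, by_epoch=False, begin=0, end=500),
    # Step decay at epochs 8 and 11.
    dict(type='MultiStepLR', by_epoch=True, milestones=[8, 11], gamma=0.1),
]
```

The MMEngine migration guide remains the authoritative source for how each hook option maps to a scheduler argument.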

Configs

The Runner in MMEngine uses a different config structure to make the components of the runner easier to understand. Users can read the config example of MMOCR or refer to the migration guide in MMEngine for migration details.
The file names of configs and models are also refactored to follow the new rules unified across OpenMMLab 2.0 projects. Please refer to the user guides of config for more details.

Dataset

The Dataset classes implemented in MMOCR 1.x all inherit from the BaseDetDataset, which inherits from the BaseDataset in MMEngine. There are several changes to Dataset in MMOCR 1.x.

All the datasets support serializing the data list to reduce the memory when multiple workers are built to accelerate data loading.
The interfaces are changed accordingly.

Data Transforms

Data transforms in MMOCR 1.x all inherit from those in MMCV>=2.0.0rc0, which follow a new convention in OpenMMLab 2.0 projects. The changes are listed below:

The interfaces are also changed. Please refer to the API Reference.
The functionalities of some data transforms (e.g., Resize) are decomposed into several transforms.
The same data transforms in different OpenMMLab 2.0 libraries share the same augmentation implementation and the same argument semantics; e.g., Resize in MMDet 3.x and MMOCR 1.x will resize the image in exactly the same manner given the same arguments.

Model

The models in MMOCR 1.x all inherit from BaseModel in MMEngine, which defines a new convention of models in OpenMMLab 2.0 projects. Users can refer to the tutorial of model in MMEngine for more details. Accordingly, there are several changes, as follows:

The model interfaces, including the input and output formats, are significantly simplified and unified following the new convention in MMOCR 1.x. Specifically, all the input data in training and testing are packed into inputs and data_samples, where inputs contains model inputs like a list of image tensors, and data_samples contains other information of the current data sample such as ground truths and model predictions. In this way, different tasks in MMOCR 1.x can share the same input arguments, which makes the models more general and suitable for multi-task learning.
The model has a data preprocessor module, which is used to pre-process the input data of model. In MMOCR 1.x, the data preprocessor usually does the necessary steps to form the input images into a batch, such as padding. It can also serve as a place for some special data augmentations or more efficient data transformations like normalization.
The internal logic of model has been changed. In MMOCR 0.x, model used forward_train and simple_test to deal with different model forward logics. In MMOCR 1.x and OpenMMLab 2.0, the forward function has three modes: loss, predict, and tensor for training, inference, and tracing or other purposes, respectively. The forward function calls self.loss(), self.predict(), and self._forward() given the modes loss, predict, and tensor, respectively.
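The three-mode dispatch described above can be sketched in plain Python. This toy class is not MMOCR or MMEngine code: ToyRecognizer, the gt_text and pred_length fields, and the "length" computation are all hypothetical and only show how forward routes to loss, predict, and _forward.

```python
from typing import Any, Dict, List, Optional


class ToyRecognizer:
    """Pure-Python stand-in for a model following the three-mode convention."""

    def forward(self,
                inputs: List[Any],
                data_samples: Optional[List[dict]] = None,
                mode: str = 'tensor'):
        # Route the call according to the requested mode.
        if mode == 'loss':
            return self.loss(inputs, data_samples)
        if mode == 'predict':
            return self.predict(inputs, data_samples)
        if mode == 'tensor':
            return self._forward(inputs, data_samples)
        raise RuntimeError(f'Invalid mode "{mode}"')

    def _forward(self, inputs, data_samples=None):
        # "Raw outputs": here simply the length of each input.
        return [len(x) for x in inputs]

    def loss(self, inputs, data_samples) -> Dict[str, float]:
        # Training: compare raw outputs with the ground truth in data_samples.
        feats = self._forward(inputs)
        errs = [abs(f - len(s['gt_text'])) for f, s in zip(feats, data_samples)]
        return {'loss': sum(errs) / len(errs)}

    def predict(self, inputs, data_samples=None) -> List[dict]:
        # Inference: write predictions back into the data samples.
        data_samples = data_samples or [dict() for _ in inputs]
        for feat, sample in zip(self._forward(inputs), data_samples):
            sample['pred_length'] = feat
        return data_samples
```

Because every task receives the same (inputs, data_samples, mode) arguments, the calling code does not need to know which task a model solves.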

Evaluation

MMOCR 1.x mainly implements corresponding metrics for each task, which are manipulated by Evaluator to complete the evaluation. In addition, users can build an evaluator in MMOCR 1.x to conduct offline evaluation, i.e., evaluate predictions that may not be produced by MMOCR, as long as the predictions follow our dataset conventions. More details can be found in the Evaluation Tutorial in MMEngine.
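The idea of offline evaluation can be sketched without any framework. The class below only mimics the "process samples, then compute metrics" pattern in a simplified form. ToyWordAccuracy and the pred_text / gt_text field names are hypothetical and are not the MMEngine Evaluator API.

```python
class ToyWordAccuracy:
    """Toy metric mirroring a process -> compute_metrics workflow."""

    def __init__(self):
        self.results = []

    def process(self, data_samples):
        # Store one lightweight result per sample.
        for sample in data_samples:
            self.results.append(sample['pred_text'] == sample['gt_text'])

    def compute_metrics(self):
        # Aggregate the stored results into the final scores.
        return {'word_acc': sum(self.results) / max(len(self.results), 1)}


# Predictions could come from any OCR system, as long as each sample
# carries the fields the metric expects (names here are hypothetical).
offline_samples = [
    dict(pred_text='HELLO', gt_text='HELLO'),
    dict(pred_text='W0RLD', gt_text='WORLD'),
]
metric = ToyWordAccuracy()
metric.process(offline_samples)
scores = metric.compute_metrics()
```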

Visualization

The functions of visualization in MMOCR 1.x are removed. Instead, in OpenMMLab 2.0 projects, we use Visualizer to visualize data. MMOCR 1.x implements TextDetLocalVisualizer, TextRecogLocalVisualizer, and KIELocalVisualizer to allow visualization of ground truths, model predictions, and feature maps, etc., at any place, for the three tasks supported in MMOCR. It also supports dumping the visualization data to any external visualization backends such as Tensorboard and Wandb. Check our Visualization Document for more details.
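A config along the following lines is one way to select a visualizer and its backends. Treat it as a sketch: TextDetLocalVisualizer comes from the text above, LocalVisBackend and TensorboardVisBackend are backend names from MMEngine, and the name value is an assumption that should be checked against the Visualization Document.

```python
# Where visualization results are written: local disk plus TensorBoard.
vis_backends = [
    dict(type='LocalVisBackend'),
    dict(type='TensorboardVisBackend'),
]

# Visualizer for the text detection task; the other tasks would use
# TextRecogLocalVisualizer or KIELocalVisualizer instead.
visualizer = dict(
    type='TextDetLocalVisualizer',
    name='visualizer',
    vis_backends=vis_backends)
```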

Improvements

Most models enjoy a performance improvement from the new framework and refactor of data transforms. For example, in MMOCR 1.x, DBNet-R50 achieves 0.854 hmean score on ICDAR 2015, while the counterpart can only get 0.840 hmean score in MMOCR 0.x.
Mixed precision training is supported for most models. The remaining models are not supported yet because the operators they use might not be representable in fp16. We will update the documentation and list the results of mixed precision training.
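In MMEngine-based configs, mixed precision is typically switched on through the optimizer wrapper. The sketch below assumes the OptimWrapper and AmpOptimWrapper types from MMEngine; the enable_amp helper and the optimizer values are hypothetical and only illustrate the change.

```python
import copy

# Full-precision optimizer wrapper (optimizer values are illustrative).
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(type='SGD', lr=0.007, momentum=0.9, weight_decay=0.0001))


def enable_amp(wrapper_cfg: dict) -> dict:
    """Return a copy of the optimizer wrapper config switched to AMP."""
    amp_cfg = copy.deepcopy(wrapper_cfg)
    amp_cfg['type'] = 'AmpOptimWrapper'
    return amp_cfg


amp_optim_wrapper = enable_amp(optim_wrapper)
```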

Ongoing changes

Test-time augmentation: supported in MMOCR 0.x, it is not implemented yet in this version due to limited time. We will support it in the following releases with a new and simplified design.
Inference interfaces: unified inference interfaces will be supported in the future to ease the use of released models.
Interfaces of useful tools that can be used in notebooks: more useful tools that are implemented in the tools/ directory will have their Python interfaces so that they can be used in notebooks and in downstream libraries.
Documentation: we will add more design docs, tutorials, and migration guidance so that the community can deep dive into our new design, participate in future development, and smoothly migrate downstream libraries to MMOCR 1.x.

- Python
Published by gaotongxiao almost 4 years ago

mmocr - MMOCR Release v0.6.1

Highlights

ArT dataset is available for text detection and recognition!
Fix several bugs that affect the correctness of the models.
Thanks to MIM, our installation is much simpler now! The docs have been renewed as well.

New Features & Enhancements

Add ArT by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1006
add ABINet_Vision api by @Abdelrahman350 in https://github.com/open-mmlab/mmocr/pull/1041
add codespell ignore and use mdformat by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/1022
Add mim to extras_requrie to setup.py, update mminstall… by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1062
Simplify normalized edit distance calculation by @maxbachmann in https://github.com/open-mmlab/mmocr/pull/1060
Test mim in CI by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1090
Remove redundant steps by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1091
Update links to SDMGR links by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1252

Bug Fixes

Remove unnecessary requirements by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1000
Remove confusing img_scales in pipelines by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1007
inplace operator "+=" will cause RuntimeError when model backward by @garvan2021 in https://github.com/open-mmlab/mmocr/pull/1018
Fix a typo problem in MASTER by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1031
Fix config name of MASTER in ocr.py by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1044
Relax OpenCV requirement by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1061
Restrict the minimum version of OpenCV to avoid potential vulnerability by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1065
typo by @tpoisonooo in https://github.com/open-mmlab/mmocr/pull/1024
Fix a typo in setup.py by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1095
fix #1067: add torchserve DockerFile and fix bugs by @Hegelim in https://github.com/open-mmlab/mmocr/pull/1073
Incorrect filename in labelme_converter.py by @xiefeifeihu in https://github.com/open-mmlab/mmocr/pull/1103
Fix dataset configs by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1106
Fix #1098: normalize text recognition scores by @Hegelim in https://github.com/open-mmlab/mmocr/pull/1119
Update STSAMJ_train.py by @MingyuLau in https://github.com/open-mmlab/mmocr/pull/1117
PSENet metafile by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1121
Flexible ways of getting file name by @balandongiv in https://github.com/open-mmlab/mmocr/pull/1107
Updating edge-embeddings after each GNN layer by @amitbcp in https://github.com/open-mmlab/mmocr/pull/1134
links update by @TekayaNidham in https://github.com/open-mmlab/mmocr/pull/1141
bug fix: access params by cfg.get by @doem97 in https://github.com/open-mmlab/mmocr/pull/1145
Fix a bug in LmdbAnnFileBackend that cause breaking in Synthtext detection training by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1159
Fix typo of --lmdb-map-size default value by @easilylazy in https://github.com/open-mmlab/mmocr/pull/1147
Fixed docstring syntax error of line 19 & 21 by @APX103 in https://github.com/open-mmlab/mmocr/pull/1157
Update lmdb_converter and ct80 cropped image source in document by @doem97 in https://github.com/open-mmlab/mmocr/pull/1164
MMCV compatibility due to outdated MMDet by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1192
Update maximum version of mmcv by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1219
Update ABINet links for main by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1221
Update owners by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1248
Add back some missing fields in configs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1171

Docs

Fix typos by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/1001
Configure Myst-parser to parse anchor tag by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1012
Fix an error in docs/en/tutorials/dataset_types.md by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1034
Update readme according to the guideline by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1047
Limit markdown version by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1172
Limit extension versions by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/1210
Update installation guide by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1254
Update image link by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/1255

New Contributors

@tpoisonooo made their first contribution in https://github.com/open-mmlab/mmocr/pull/1024
@Abdelrahman350 made their first contribution in https://github.com/open-mmlab/mmocr/pull/1041
@Hegelim made their first contribution in https://github.com/open-mmlab/mmocr/pull/1073
@xiefeifeihu made their first contribution in https://github.com/open-mmlab/mmocr/pull/1103
@MingyuLau made their first contribution in https://github.com/open-mmlab/mmocr/pull/1117
@balandongiv made their first contribution in https://github.com/open-mmlab/mmocr/pull/1107
@amitbcp made their first contribution in https://github.com/open-mmlab/mmocr/pull/1134
@TekayaNidham made their first contribution in https://github.com/open-mmlab/mmocr/pull/1141
@easilylazy made their first contribution in https://github.com/open-mmlab/mmocr/pull/1147
@APX103 made their first contribution in https://github.com/open-mmlab/mmocr/pull/1157

Full Changelog: https://github.com/open-mmlab/mmocr/compare/v0.6.0...v0.6.1

- Python
Published by gaotongxiao almost 4 years ago

mmocr - MMOCR Release v0.6.0

Highlights

A new recognition algorithm MASTER has been added into MMOCR, which was the championship solution for the "ICDAR 2021 Competition on Scientific Table Image Recognition to Latex"! The model pre-trained on SynthText and MJSynth is available for testing! Credit to @JiaquanYe
DBNet++ has been released now! A new Adaptive Scale Fusion module has been equipped for feature enhancement. Benefiting from this, the new model achieved 2% better h-mean score than its predecessor on the ICDAR2015 dataset.
Three more dataset converters are added: LSVT, RCTW and HierText. Check the dataset zoo (Det & Recog) to explore further information.
To enhance the data storage efficiency, MMOCR now supports loading both images and labels from .lmdb format annotations for the text recognition task. To enable such a feature, the new lmdb_converter.py is ready for use to pack your cropped images and labels into an lmdb file. For a detailed tutorial, please refer to the following sections and the doc.
Testing models on multiple datasets is a widely used evaluation strategy. MMOCR now supports automatically reporting mean scores when there is more than one dataset to evaluate, which enables a more convenient comparison between checkpoints. Doc
Evaluation is more flexible and customizable now. For text detection tasks, you can set the score threshold range where the best results might come out. (Doc) If too many results are flooding your text recognition train log, you can trim it by specifying a subset of metrics in evaluation config. Check out the Evaluation section for details.
MMOCR provides a script to convert the .json labels obtained by the popular annotation toolkit Labelme to MMOCR-supported data format. @Y-M-Y contributed a log analysis tool that helps users gain a better understanding of the entire training process. Read tutorial docs to get started.
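The mean-score reporting mentioned in the highlights above amounts to averaging each metric over the evaluated datasets. A minimal sketch follows; mean_scores is a hypothetical helper, not the MMOCR implementation, and the numbers are made up.

```python
def mean_scores(per_dataset: dict) -> dict:
    """Average every metric that is reported for all datasets."""
    shared = set.intersection(*(set(s) for s in per_dataset.values()))
    n = len(per_dataset)
    return {
        f'mean_{name}': sum(s[name] for s in per_dataset.values()) / n
        for name in sorted(shared)
    }


# Made-up scores for two datasets.
scores = {
    'dataset_a': {'word_acc': 0.5},
    'dataset_b': {'word_acc': 0.75},
}
summary = mean_scores(scores)
```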

Lmdb Dataset

Reading images or labels from files can be slow when the dataset is very large, e.g., on the scale of millions of samples. Besides, in academia, most scene text recognition datasets are stored in lmdb format, including images and labels. To get closer to the mainstream practice and enhance the data storage efficiency, MMOCR now officially supports loading images and labels from lmdb datasets via a new pipeline LoadImageFromLMDB. This section is intended to serve as a quick walkthrough for you to master this update and apply it to facilitate your research.

Specifications

To better align with the academic community, MMOCR now requires the following specifications for lmdb datasets:

The parameter describing the data volume of the dataset is num-samples instead of total_number (deprecated).
Images and labels are stored with keys in the form of image-000000001 and label-000000001, respectively.
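The key convention above can be sketched as a small helper. It assumes that sample indices are 1-based and zero-padded to nine digits, as in the example keys, and that keys are stored as bytes; lmdb_keys and NUM_SAMPLES_KEY are hypothetical names.

```python
# Key under which the dataset size is stored.
NUM_SAMPLES_KEY = b'num-samples'


def lmdb_keys(index: int) -> tuple:
    """Return the (image key, label key) pair for a 1-based sample index."""
    if index < 1:
        raise ValueError('sample indices start at 1')
    return (f'image-{index:09d}'.encode(), f'label-{index:09d}'.encode())
```

For example, lmdb_keys(1) yields the keys image-000000001 and label-000000001 shown above.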

Usage

Use an existing academic lmdb dataset if it meets the specifications, or use the tool provided by MMOCR to pack images and annotations into an lmdb dataset.

Previously, MMOCR had a function txt2lmdb (deprecated) that only supported converting labels to lmdb format. However, it is quite different from academic lmdb datasets, which usually contain both images and labels. Now MMOCR provides a new utility lmdb_converter to convert recognition datasets with both images and labels to lmdb format.
Say that your recognition data in MMOCR's format are organized as follows (see an example in ocr_toy_dataset).

```text
# Directory structure
├──img_path
|      |—— img1.jpg
|      |—— img2.jpg
|      |—— ...
|——label.txt (or label.jsonl)

# Annotation format
label.txt:  img1.jpg HELLO
            img2.jpg WORLD
            ...

label.jsonl:    {'filename':'img1.jpg', 'text':'HELLO'}
                {'filename':'img2.jpg', 'text':'WORLD'}
                ...
```

Then pack these files up:

```bash
python tools/data/utils/lmdb_converter.py {PATH_TO_LABEL} {OUTPUT_PATH} --i {PATH_TO_IMAGES}
```

Check out tools.md for more details.

The second step is to modify the configuration files. For example, to train CRNN on MJ and ST datasets:

Set parser as LineJsonParser and file_format as 'lmdb' in dataset config:

```python
# configs/_base_/recog_datasets/ST_MJ_train.py
train1 = dict(
    type='OCRDataset',
    img_prefix=train_img_prefix1,
    ann_file=train_ann_file1,
    loader=dict(
        type='AnnFileLoader',
        repeat=1,
        file_format='lmdb',
        parser=dict(
            type='LineJsonParser',
            keys=['filename', 'text'],
        )),
    pipeline=None,
    test_mode=False)
```

Use LoadImageFromLMDB in pipeline:

```python
# configs/_base_/recog_pipelines/crnn_pipeline.py
train_pipeline = [
    dict(type='LoadImageFromLMDB', color_type='grayscale'),
    ...
]
```

You are good to go! Start training and MMOCR will load data from your lmdb dataset.

New Features & Enhancements

Add analyze_logs in tools and its description in docs by @Y-M-Y in https://github.com/open-mmlab/mmocr/pull/899
Add LSVT Data Converter by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/896
Add RCTW dataset converter by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/914
Support computing mean scores in UniformConcatDataset by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/981
Support loading images and labels from lmdb file by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/982
Add recog2lmdb and new toy dataset files by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/979
Add labelme converter for textdet and textrecog by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/972
Update CircleCI configs by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/918
Update Git Action by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/930
More customizable fields in dataloaders by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/933
Skip CIs when docs are modified by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/941
Rename Github tests, fix ignored paths by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/946
Support latest MMCV by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/959
Support dynamic threshold range in eval_hmean by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/962
Update the version requirement of mmdet in docker by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/966
Replace opencv-python-headless with opencv-python by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/970
Update Dataset Configs by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/980
Add SynthText dataset config by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/983
Automatically report mean scores when applicable by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/995
Add DBNet++ by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/973
Add MASTER by @JiaquanYe in https://github.com/open-mmlab/mmocr/pull/807
Allow choosing metrics to report in text recognition tasks by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/989
Add HierText converter by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/948
Fix lint_only in CircleCI by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/998

Bug Fixes

Fix CircleCi Main Branch Accidentally Run PR Stage Test by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/927
Fix a deprecate warning about mmdet.datasets.pipelines.formating by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/944
Fix a Bug in ResNet plugin by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/967
revert a wrong setting in db_r18 cfg by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/978
Fix TotalText Anno version issue by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/945
Update installation step of albumentations by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/984
Fix ImgAug transform by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/949
Fix GPG key error in CI and docker by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/988
update label.lmdb by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/991
correct meta key by @garvan2021 in https://github.com/open-mmlab/mmocr/pull/926
Use new image by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/976
Fix Data Converter Issues by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/955

Docs

Update CONTRIBUTING.md by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/905
Fix the misleading description in test.py by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/908
Update recog.md for lmdb Generation by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/934
Add MMCV by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/954
Add wechat QR code to CN readme by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/960
Update CONTRIBUTING.md by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/947
Use QR codes from MMCV by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/971
Renew dataset_types.md by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/997

New Contributors

@Y-M-Y made their first contribution in https://github.com/open-mmlab/mmocr/pull/899

Full Changelog: https://github.com/open-mmlab/mmocr/compare/v0.5.0...v0.6.0

- Python
Published by gaotongxiao about 4 years ago

mmocr - MMOCR Release v0.5.0

Highlights

MMOCR now supports SPACE recognition! (What a prominent feature!) Users only need to convert the recognition annotations that contain spaces from a plain .txt file to JSON line format .jsonl, and then revise a few configurations to enable the LineJsonParser. For more information, please read our step-by-step tutorial.
Tesseract is now available in MMOCR! While MMOCR is more flexible to support various downstream tasks, users might sometimes not be satisfied with DL models and would like to turn to effective legacy solutions. Therefore, we offer this option in mmocr.utils.ocr by wrapping Tesseract as a detector and/or recognizer. Users can easily create an MMOCR object by MMOCR(det='Tesseract', recog='Tesseract'). Credit to @garvan2021
We release data converters for 16 widely used OCR datasets, including multiple scenarios such as document, handwritten, and scene text. Now it is more convenient to generate annotation files for these datasets. Check the dataset zoo ( Det & Recog ) to explore further information.
Special thanks to @EighteenSprings @BeyondYourself @yangrisheng, who had actively participated in documentation translation!
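For the SPACE recognition highlight above, converting plain-text annotations to JSON lines can be sketched as follows. It assumes each .txt line has the form "filename text" separated by a single space, so that everything after the first space (including inner spaces) is the transcription; txt_line_to_json is a hypothetical helper, not an MMOCR tool.

```python
import json


def txt_line_to_json(line: str) -> str:
    """Convert 'img1.jpg HELLO WORLD' into a JSON line keeping inner spaces."""
    # Split only on the first space so that spaces in the text survive.
    filename, text = line.rstrip('\n').split(' ', 1)
    return json.dumps({'filename': filename, 'text': text}, ensure_ascii=False)


lines = ['img1.jpg HELLO WORLD', 'img2.jpg MMOCR']
jsonl = [txt_line_to_json(line) for line in lines]
```

Each resulting line carries the filename and text keys that LineJsonParser is configured with in the dataset config.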

Migration Guide - ResNet

Some refactoring processes are still going on. For text recognition models, we unified the ResNet-like architectures which are used as backbones. By introducing stage-wise and block-wise plugins, the refactored ResNet is highly flexible to support existing models, like ResNet31 and ResNet45, and other future designs of ResNet variants.

Plugin

Plugin is a module category inherited from MMCV's implementation of PLUGIN_LAYERS, which can be inserted between stages of ResNet or into a BasicBlock. You can find a simple implementation of a plugin at mmocr/models/textrecog/plugins/common.py, or in the example below.

Plugin Example
```python
import torch.nn as nn
from mmcv.cnn import PLUGIN_LAYERS


@PLUGIN_LAYERS.register_module()
class Maxpool2d(nn.Module):
    """A wrapper around nn.MaxPool2d().

    Args:
        kernel_size (int or tuple(int)): Kernel size for max pooling layer
        stride (int or tuple(int)): Stride for max pooling layer
        padding (int or tuple(int)): Padding for pooling layer
    """

    def __init__(self, kernel_size, stride, padding=0, **kwargs):
        super(Maxpool2d, self).__init__()
        self.model = nn.MaxPool2d(kernel_size, stride, padding)

    def forward(self, x):
        """
        Args:
            x (Tensor): Input feature map

        Returns:
            Tensor: The tensor after Maxpooling layer.
        """
        return self.model(x)
```

Stage-wise Plugins

ResNet is composed of stages, and each stage is composed of blocks. E.g., ResNet18 is composed of 4 stages, and each stage is composed of BasicBlocks. For each stage, we provide two ports to insert stage-wise plugins, configured through the plugins parameter of ResNet.

```text
[port1: before stage] ---> [stage] ---> [port2: after stage]
```

Take a ResNet with four stages as an example. Suppose we want to insert an additional convolution layer before each stage, and an additional convolution layer after stages 1, 2, and 4. Then you can define the special ResNet18 like this:

```python
resnet18_special = ResNet(
    # for simplicity, some required
    # parameters are omitted
    plugins=[
        dict(
            cfg=dict(
                type='ConvModule',
                kernel_size=3,
                stride=1,
                padding=1,
                norm_cfg=dict(type='BN'),
                act_cfg=dict(type='ReLU')),
            stages=(True, True, True, True),
            position='before_stage'),
        dict(
            cfg=dict(
                type='ConvModule',
                kernel_size=3,
                stride=1,
                padding=1,
                norm_cfg=dict(type='BN'),
                act_cfg=dict(type='ReLU')),
            stages=(True, True, False, True),
            position='after_stage')
    ])
```

You can also insert more than one plugin in each port and those plugins will be executed in order. Let's take ResNet in MASTER as an example:

Multiple Plugins Example
- ResNet in MASTER is based on ResNet31. After each stage, a module named GCAModule is used. The GCAModule is inserted before the stage-wise convolution layer in ResNet31. As a result, there are two plugins at the after_stage port at the same time.

```python
resnet_master = ResNet(
    # for simplicity, some required
    # parameters are omitted
    plugins=[
        dict(
            cfg=dict(type='Maxpool2d', kernel_size=2, stride=(2, 2)),
            stages=(True, True, False, False),
            position='before_stage'),
        dict(
            cfg=dict(type='Maxpool2d', kernel_size=(2, 1), stride=(2, 1)),
            stages=(False, False, True, False),
            position='before_stage'),
        dict(
            cfg=dict(type='GCAModule', kernel_size=3, stride=1, padding=1),
            stages=[True, True, True, True],
            position='after_stage'),
        dict(
            cfg=dict(
                type='ConvModule',
                kernel_size=3,
                stride=1,
                padding=1,
                norm_cfg=dict(type='BN'),
                act_cfg=dict(type='ReLU')),
            stages=(True, True, True, True),
            position='after_stage')
    ])
```
In each plugin, we will pass two parameters (in_channels, out_channels) to support operations that need the information of current channels.
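To make the stages masks concrete, the following plain-Python sketch lists which plugins would be active at each port of each stage for a MASTER-style plugin list. It is an illustration only: plugins_for_stage is a hypothetical helper, the cfg dicts are trimmed to their type, and the actual resolution logic inside ResNet may differ.

```python
def plugins_for_stage(plugins, stage_idx, position):
    """List plugin types active at `position` for stage `stage_idx`, in order."""
    return [
        p['cfg']['type'] for p in plugins
        if p['position'] == position and p['stages'][stage_idx]
    ]


# Trimmed version of the MASTER-style plugin list (cfg reduced to the type).
master_plugins = [
    dict(cfg=dict(type='Maxpool2d'),
         stages=(True, True, False, False), position='before_stage'),
    dict(cfg=dict(type='Maxpool2d'),
         stages=(False, False, True, False), position='before_stage'),
    dict(cfg=dict(type='GCAModule'),
         stages=(True, True, True, True), position='after_stage'),
    dict(cfg=dict(type='ConvModule'),
         stages=(True, True, True, True), position='after_stage'),
]

for idx in range(4):
    print(idx,
          plugins_for_stage(master_plugins, idx, 'before_stage'),
          plugins_for_stage(master_plugins, idx, 'after_stage'))
```

The list order is preserved, which reflects the statement that plugins sharing a port are executed in order: GCAModule comes before ConvModule after every stage.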

Block-wise Plugin (Experimental)

We also refactored the BasicBlock used in ResNet. Now it can be customized with block-wise plugins. Check here for more details.
BasicBlock is composed of two convolution layers in the main branch and a shortcut branch. We provide four ports to insert plugins.

```text
[port1: before_conv1] ---> [conv1] ---> [port2: after_conv1] ---> [conv2] ---> [port3: after_conv2] ---> +(shortcut) ---> [port4: after_shortcut]
```

In each plugin, we will pass a parameter in_channels to support operations that need the information of current channels.
E.g., build a ResNet with a customized BasicBlock that has an additional convolution layer before conv1:

Block-wise Plugin Example
```python
resnet_31 = ResNet(
    in_channels=3,
    stem_channels=[64, 128],
    block_cfgs=dict(type='BasicBlock'),
    arch_layers=[1, 2, 5, 3],
    arch_channels=[256, 256, 512, 512],
    strides=[1, 1, 1, 1],
    plugins=[
        dict(
            cfg=dict(type='Maxpool2d', kernel_size=2, stride=(2, 2)),
            stages=(True, True, False, False),
            position='before_stage'),
        dict(
            cfg=dict(type='Maxpool2d', kernel_size=(2, 1), stride=(2, 1)),
            stages=(False, False, True, False),
            position='before_stage'),
        dict(
            cfg=dict(
                type='ConvModule',
                kernel_size=3,
                stride=1,
                padding=1,
                norm_cfg=dict(type='BN'),
                act_cfg=dict(type='ReLU')),
            stages=(True, True, True, True),
            position='after_stage')
    ])
```

Full Examples

ResNet without plugins

- ResNet45 is used in ASTER and ABINet without any plugins.

```python
resnet45_aster = ResNet(
    in_channels=3,
    stem_channels=[64, 128],
    block_cfgs=dict(type='BasicBlock', use_conv1x1='True'),
    arch_layers=[3, 4, 6, 6, 3],
    arch_channels=[32, 64, 128, 256, 512],
    strides=[(2, 2), (2, 2), (2, 1), (2, 1), (2, 1)])

resnet45_abi = ResNet(
    in_channels=3,
    stem_channels=32,
    block_cfgs=dict(type='BasicBlock', use_conv1x1='True'),
    arch_layers=[3, 4, 6, 6, 3],
    arch_channels=[32, 64, 128, 256, 512],
    strides=[2, 1, 2, 1, 1])
```

ResNet with plugins

- ResNet31 is a typical architecture that uses stage-wise plugins. Before each of the first three stages, a max pooling layer is used. After each stage, a convolution layer with BN and ReLU is used.

```python
resnet_31 = ResNet(
    in_channels=3,
    stem_channels=[64, 128],
    block_cfgs=dict(type='BasicBlock'),
    arch_layers=[1, 2, 5, 3],
    arch_channels=[256, 256, 512, 512],
    strides=[1, 1, 1, 1],
    plugins=[
        dict(
            cfg=dict(type='Maxpool2d', kernel_size=2, stride=(2, 2)),
            stages=(True, True, False, False),
            position='before_stage'),
        dict(
            cfg=dict(type='Maxpool2d', kernel_size=(2, 1), stride=(2, 1)),
            stages=(False, False, True, False),
            position='before_stage'),
        dict(
            cfg=dict(
                type='ConvModule',
                kernel_size=3,
                stride=1,
                padding=1,
                norm_cfg=dict(type='BN'),
                act_cfg=dict(type='ReLU')),
            stages=(True, True, True, True),
            position='after_stage')
    ])
```

Migration Guide - Dataset Annotation Loader

The annotation loaders, LmdbLoader and HardDiskLoader, are unified into AnnFileLoader for a more consistent design and wider support for different file formats and storage backends. AnnFileLoader can load annotations from the disk (default), http, and petrel backends, and parse annotations in txt or lmdb format. LmdbLoader and HardDiskLoader are deprecated, and users are recommended to modify their configs to use the new AnnFileLoader. Users can migrate their legacy loader HardDiskLoader by referring to the following example:

```python
# Legacy config
train = dict(
    type='OCRDataset',
    ...
    loader=dict(
        type='HardDiskLoader',
        ...))

# Suggested config
train = dict(
    type='OCRDataset',
    ...
    loader=dict(
        type='AnnFileLoader',
        file_storage_backend='disk',
        file_format='txt',
        ...))
```

Similarly, using AnnFileLoader with file_format='lmdb' instead of LmdbLoader is strongly recommended.
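For configs that are built or patched programmatically, the same migration can be sketched as a small dictionary rewrite. `migrate_loader` and `LEGACY_LOADERS` are hypothetical names used only for this illustration, not MMOCR functions. The backend key is written as `file_storage_backend`, on the assumption that this is the argument name behind the suggested config, and the `repeat=1` entry is just a placeholder key.

```python
import copy

# Hypothetical mapping from each deprecated loader to the file format it implies.
LEGACY_LOADERS = {'HardDiskLoader': 'txt', 'LmdbLoader': 'lmdb'}


def migrate_loader(loader_cfg, backend='disk'):
    """Return a copy of a legacy loader config rewritten for AnnFileLoader."""
    cfg = copy.deepcopy(loader_cfg)
    file_format = LEGACY_LOADERS.get(cfg.get('type'))
    if file_format is None:
        return cfg  # already migrated or unknown type: leave untouched
    cfg['type'] = 'AnnFileLoader'
    cfg['file_storage_backend'] = backend  # assumed argument name
    cfg['file_format'] = file_format
    return cfg


legacy = dict(type='HardDiskLoader', repeat=1)  # placeholder legacy config
migrated = migrate_loader(legacy)
print(migrated)
```

The helper copies its input, so the legacy config stays available for comparison, and loaders that are not in the mapping pass through unchanged.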

New Features & Enhancements

Update mmcv install by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/775
Upgrade isort by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/771
Automatically infer device for inference if not specified by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/781
Add open-mmlab precommit hooks by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/787
Add windows CI by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/790
Add CurvedSyntext150k Converter by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/719
Add FUNSD Converter by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/808
Support loading annotation file with petrel/http backend by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/793
Support different seeds on different ranks by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/820
Support json in recognition converter by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/844
Add args and docs for multi-machine training/testing by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/849
Add warning info for LineStrParser by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/850
Deploy openmmlab-bot by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/876
Add Tesserocr Inference by @garvan2021 in https://github.com/open-mmlab/mmocr/pull/814
Add LV Dataset Converter by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/871
Add SROIE Converter by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/810
Add NAF Converter by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/815
Add DeText Converter by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/818
Add IMGUR Converter by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/825
Add ILST Converter by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/833
Add KAIST Converter by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/835
Add IC11 (Born-digital Images) Data Converter by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/857
Add IC13 (Focused Scene Text) Data Converter by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/861
Add BID Converter by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/862
Add Vintext Converter by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/864
Add MTWI Data Converter by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/867
Add COCO Text v2 Data Converter by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/872
Add ReCTS Data Converter by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/892
Refactor ResNets by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/809

Bug Fixes

Bump mmdet version to 2.20.0 in Dockerfile by @GPhilo in https://github.com/open-mmlab/mmocr/pull/763
Update mmdet version limit by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/773
Minimum version requirement of albumentations by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/769
Disable worker in the dataloader of gpu unit test by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/780
Standardize the type of torch.device in ocr.py by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/800
Use RECOGNIZER instead of DETECTORS by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/685
Add num_classes to configs of ABINet by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/805
Support loading space character from dict file by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/854
Description in tools/data/utils/txt2lmdb.py by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/870
ignore_index in SARLoss by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/869
Fix a bug that may cause inplace operation error by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/884
Use hyphen instead of underscores in script args by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/890

Docs

Add deprecation message for deploy tools by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/801
Reorganizing OpenMMLab projects in readme by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/806
Add demo/README_zh.md by @EighteenSprings in https://github.com/open-mmlab/mmocr/pull/802
Add detailed version requirement table by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/778
Correct misleading section title in training.md by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/819
Update README_zh-CN document URL by @BeyondYourself in https://github.com/open-mmlab/mmocr/pull/823
translate testing.md. by @yangrisheng in https://github.com/open-mmlab/mmocr/pull/822
Fix confused description for load-from and resume-from by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/842
Add documents getting_started in docs/zh by @BeyondYourself in https://github.com/open-mmlab/mmocr/pull/841
Add the model serving translation document by @BeyondYourself in https://github.com/open-mmlab/mmocr/pull/845
Update docs about installation on Windows by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/852
Update tutorial notebook by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/853
Update Instructions for New Data Converters by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/900
Brief installation instruction in README by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/897
update doc for ILST, VinText, BID by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/902
Fix typos in readme by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/903
Recog dataset doc by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/893
Reorganize the directory structure section in det.md by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/894

New Contributors

@GPhilo made their first contribution in https://github.com/open-mmlab/mmocr/pull/763
@xinke-wang made their first contribution in https://github.com/open-mmlab/mmocr/pull/801
@EighteenSprings made their first contribution in https://github.com/open-mmlab/mmocr/pull/802
@BeyondYourself made their first contribution in https://github.com/open-mmlab/mmocr/pull/823
@yangrisheng made their first contribution in https://github.com/open-mmlab/mmocr/pull/822
@Mountchicken made their first contribution in https://github.com/open-mmlab/mmocr/pull/844
@garvan2021 made their first contribution in https://github.com/open-mmlab/mmocr/pull/814

Full Changelog: https://github.com/open-mmlab/mmocr/compare/v0.4.1...v0.5.0

- Python
Published by gaotongxiao about 4 years ago

mmocr - MMOCR Release v0.4.1

Highlights

Visualizing edge weights in OpenSet KIE is now supported! https://github.com/open-mmlab/mmocr/pull/677
Some configurations have been optimized to significantly speed up the training and testing processes! Don't worry - you can still tune these parameters in case these modifications do not work. https://github.com/open-mmlab/mmocr/pull/757
Now you can use CPU to train/debug your model! https://github.com/open-mmlab/mmocr/pull/752
We have fixed a severe bug that prevented users from calling mmocr.apis.test with our pre-built wheels. https://github.com/open-mmlab/mmocr/pull/667

New Features & Enhancements

Show edge score for openset kie by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/677
Download flake8 from github as pre-commit hooks by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/695
Deprecate the support for 'python setup.py test' by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/722
Disable multi-processing feature of cv2 to speed up data loading by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/721
Extend ctw1500 converter to support text fields by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/729
Extend totaltext converter to support text fields by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/728
Speed up training by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/739
Add setup multi-processing both in train and test.py by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/757
Support CPU training/testing by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/752
Support specifying the GPU for testing and training with gpu-id instead of gpu-ids and gpus by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/756
Remove unnecessary custom_import from test.py by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/758

Bug Fixes

Fix satrn onnxruntime test by @AllentDan in https://github.com/open-mmlab/mmocr/pull/679
Support both ConcatDataset and UniformConcatDataset by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/675
Fix bugs of show_results in single_gpu_test by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/667
Fix a bug for sar decoder when bi-rnn is used by @MhLiao in https://github.com/open-mmlab/mmocr/pull/690
Fix opencv version to avoid some bugs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/694
Fix py39 ci error by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/707
Update visualize.py by @TommyZihao in https://github.com/open-mmlab/mmocr/pull/715
Fix link of config by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/726
Use yaml.safe_load instead of load by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/753
Add necessary keys to test_pipelines to enable test-time visualization by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/754

Docs

Fix recog.md by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/674
Add config tutorial by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/683
Add MMSelfSup/MMRazor/MMDeploy in readme by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/692
Add recog & det model summary by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/693
Update docs link by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/710
add pull request template.md by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/711
Add website links to readme by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/731
update readme according to standard by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/742

New Contributors

@MhLiao made their first contribution in https://github.com/open-mmlab/mmocr/pull/690
@TommyZihao made their first contribution in https://github.com/open-mmlab/mmocr/pull/715

Full Changelog: https://github.com/open-mmlab/mmocr/compare/v0.4.0...v0.4.1

- Python
Published by gaotongxiao over 4 years ago

mmocr - MMOCR Release v0.4.0

Highlights

We release a new text recognition model - ABINet (CVPR 2021, Oral). With dedicated model design and useful data augmentation transforms, ABINet achieves the best performance on irregular text recognition tasks. Check it out!
We are also working hard to fulfill the requests from our community. OpenSet KIE is one of the achievements, which extends the application of SDMGR from text node classification to node-pair relation extraction. We also provide a demo script to convert WildReceipt to open set domain, though it may not take full advantage of the OpenSet format. For more information, read our tutorial.
Model APIs can be exposed through TorchServe; see the docs.

Breaking Changes & Migration Guide

Postprocessor

Some refactoring is still in progress. For all text detection models, we unified their decode implementations into a new module category, POSTPROCESSOR, which is responsible for decoding different raw outputs into boundary instances. In all text detection configs, the text_repr_type argument in bbox_head is deprecated and will be removed in a future release.

Migration Guide:

Find a line like the following in the detection model's config:

    text_repr_type=xxx,

and replace it with:

    postprocessor=dict(type='{MODEL_NAME}Postprocessor', text_repr_type=xxx)),

Take a snippet of PANet's config as an example. Before the change, its config for bbox_head looks like:

    bbox_head=dict(
        type='PANHead',
        text_repr_type='poly',
        in_channels=[128, 128, 128, 128],
        out_channels=6,
        loss=dict(type='PANLoss')),

Afterwards:

    bbox_head=dict(
        type='PANHead',
        in_channels=[128, 128, 128, 128],
        out_channels=6,
        loss=dict(type='PANLoss'),
        postprocessor=dict(type='PANPostprocessor', text_repr_type='poly')),

There are other postprocessors and each takes different arguments. Interested users can find their interfaces or implementations in mmocr/models/textdet/postprocess or through our api docs.
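When many detection configs need the same change, the rewrite can be sketched as a dictionary transformation. `migrate_bbox_head` is a hypothetical helper, not an MMOCR utility. It assumes the postprocessor follows the `{MODEL_NAME}Postprocessor` naming pattern quoted in the guide and takes no other arguments, which may not hold for every model.

```python
import copy


def migrate_bbox_head(bbox_head, model_name):
    """Move a deprecated text_repr_type into a postprocessor entry.

    Hypothetical helper: assumes the '{MODEL_NAME}Postprocessor' naming
    pattern and that no other postprocessor arguments are required.
    """
    cfg = copy.deepcopy(bbox_head)
    if 'text_repr_type' not in cfg:
        return cfg  # nothing to migrate
    text_repr_type = cfg.pop('text_repr_type')
    postprocessor = cfg.setdefault(
        'postprocessor', dict(type=f'{model_name}Postprocessor'))
    # Keep an explicitly configured value if one is already present.
    postprocessor.setdefault('text_repr_type', text_repr_type)
    return cfg


old_head = dict(
    type='PANHead',
    text_repr_type='poly',
    in_channels=[128, 128, 128, 128],
    out_channels=6,
    loss=dict(type='PANLoss'))

new_head = migrate_bbox_head(old_head, 'PAN')
print(new_head)
```

Applied to the PANet snippet, this yields the same bbox_head as the "Afterwards" config, with `text_repr_type` removed from the top level.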

New Config Structure

We reorganized the configs/ directory by extracting reusable sections into configs/_base_. Now the directory tree of configs/_base_ is organized as follows:

_base_
├── det_datasets
├── det_models
├── det_pipelines
├── recog_datasets
├── recog_models
├── recog_pipelines
└── schedules

Most model configs now make full use of the base configs, which makes the overall structure clearer and facilitates fair comparison across models. Despite the seemingly significant change in hierarchy, backward compatibility is preserved because the names of the model configs remain the same.

New Features

Support openset kie by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/498
Add converter for the Open Images v5 text annotations by Krylov et al. by @baudm in https://github.com/open-mmlab/mmocr/pull/497
Support Chinese for kie show result by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/464
Add TorchServe support for text detection and recognition by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/522
Save filename in text detection test results by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/570
Add codespell pre-commit hook and fix typos by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/520
Avoid duplicate placeholder docs in CN by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/582
Save results to json file for kie. by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/589
Add SAR_CN to ocr.py by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/579
mim extension for windows by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/641
Support multiple pipelines for different datasets by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/657
ABINet Framework by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/651

Refactoring

Refactor textrecog config structure by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/617
Refactor text detection config by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/626
refactor transformer modules by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/618
refactor textdet postprocess by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/640

Docs

C++ example section by @apiaccess21 in https://github.com/open-mmlab/mmocr/pull/593
install.md Chinese section by @A465539338 in https://github.com/open-mmlab/mmocr/pull/364
Add Chinese Translation of deployment.md. by @fatfishZhao in https://github.com/open-mmlab/mmocr/pull/506
Fix a model link and add the metafile for SATRN by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/473
Improve docs style by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/474
Enhancement & sync Chinese docs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/492
TorchServe docs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/539
Update docs menu by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/564
Docs for KIE CloseSet & OpenSet by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/573
Fix broken links by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/576
Docstring for text recognition models by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/562
Add MMFlow & MIM by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/597
Add MMFewShot by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/621
Update model readme by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/604
Add input size check to model_inference by @mpena-vina in https://github.com/open-mmlab/mmocr/pull/633
Docstring for textdet models by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/561
Add MMHuman3D in readme by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/644
Use shared menu from theme instead by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/655
Refactor docs structure by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/662
Docs fix by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/664

Enhancements

Use bounding box around polygon instead of within polygon by @alexander-soare in https://github.com/open-mmlab/mmocr/pull/469
Add CITATION.cff by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/476
Add py3.9 CI by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/475
update model-index.yml by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/484
Use container in CI by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/502
CircleCI Setup by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/611
Remove unnecessary custom_import from train.py by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/603
Change the upper version of mmcv to 1.5.0 by @zhouzaida in https://github.com/open-mmlab/mmocr/pull/628
Update CircleCI by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/631
Pass custom_hooks to MMCV by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/609
Skip CI when some specific files were changed by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/642
Add markdown linter in pre-commit hook by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/643
Use shape from loaded image by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/652
Cancel previous runs that are not completed by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/666

Bug Fixes

Modify algorithm "sar" weights path in metafile by @ShoupingShan in https://github.com/open-mmlab/mmocr/pull/581
Fix Cuda CI by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/472
Fix image export in test.py for KIE models by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/486
Allow invalid polygons in intersection and union by default by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/471
Update checkpoints' links for SATRN by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/518
Fix an ONNX conversion bug caused by changing the key from img_shape to resize_shape by @Harold-lkk in https://github.com/open-mmlab/mmocr/pull/523
Fix PyTorch 1.6 incompatible checkpoints by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/540
Fix paper field in metafiles by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/550
Unify recognition task names in metafiles by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/548
Fix py3.9 CI by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/563
Always map location to cpu when loading checkpoint by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/567
Fix wrong model builder in recog_test_imgs by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/574
Improve dbnet r50 by fixing img std by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/578
Fix resource warning: unclosed file by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/577
Fix a bug that used the same start point for different texts in draw_texts_by_pil by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/587
Keep original texts for kie by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/588
Fix random seed by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/600
Fix DBNet_r50 config by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/625
Change SBC case to DBC case by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/632
Fix kie demo by @innerlee in https://github.com/open-mmlab/mmocr/pull/610
fix type check by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/650
Remove deprecated image validator in totaltext converter by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/661
Fix change locals() dict by @Fei-Wang in https://github.com/open-mmlab/mmocr/pull/663
fix #614: textsnake targets by @HolyCrap96 in https://github.com/open-mmlab/mmocr/pull/660

New Contributors

@alexander-soare made their first contribution in https://github.com/open-mmlab/mmocr/pull/469
@A465539338 made their first contribution in https://github.com/open-mmlab/mmocr/pull/364
@fatfishZhao made their first contribution in https://github.com/open-mmlab/mmocr/pull/506
@baudm made their first contribution in https://github.com/open-mmlab/mmocr/pull/497
@ShoupingShan made their first contribution in https://github.com/open-mmlab/mmocr/pull/581
@apiaccess21 made their first contribution in https://github.com/open-mmlab/mmocr/pull/593
@zhouzaida made their first contribution in https://github.com/open-mmlab/mmocr/pull/628
@mpena-vina made their first contribution in https://github.com/open-mmlab/mmocr/pull/633
@Fei-Wang made their first contribution in https://github.com/open-mmlab/mmocr/pull/663

Full Changelog: https://github.com/open-mmlab/mmocr/compare/v0.3.0...v0.4.0

- Python
Published by gaotongxiao over 4 years ago

mmocr - MMOCR Release v0.3.0

Highlights

We add a new text recognition model -- SATRN! Its pretrained checkpoint achieves the best performance over other provided text recognition models. A lighter version of SATRN is also released which can obtain ~98% of the performance of the original model with only 45 MB in size. (@2793145003) #405
Improve the demo script, ocr.py, which supports applying end-to-end text detection, text recognition and key information extraction models on images with easy-to-use commands. Users can find its full documentation in the demo section. (@samayala22, @manjrekarom) #371, #386, #400, #374, #428
Our documentation is reorganized into a clearer structure. More useful contents are on the way! #409, #454
The requirement of Polygon3 is removed since this project is no longer maintained or distributed. We unified all its references to equivalent substitutions in shapely instead. #448

Breaking Changes & Migration Guide

Upgrade version requirement of MMDetection to 2.14.0 to avoid bugs #382
MMOCR now has its own model and layer registries inherited from MMDetection's or MMCV's counterparts (#436). The model registries are now organized in the following hierarchy:

mmcv.MODELS -> mmdet.BACKBONES -> BACKBONES
mmcv.MODELS -> mmdet.NECKS -> NECKS
mmcv.MODELS -> mmdet.ROI_EXTRACTORS -> ROI_EXTRACTORS
mmcv.MODELS -> mmdet.HEADS -> HEADS
mmcv.MODELS -> mmdet.LOSSES -> LOSSES
mmcv.MODELS -> mmdet.DETECTORS -> DETECTORS
mmcv.ACTIVATION_LAYERS -> ACTIVATION_LAYERS
mmcv.UPSAMPLE_LAYERS -> UPSAMPLE_LAYERS

To migrate your old implementation to the new backend, change the import path of any registries and their corresponding builder functions (including build_detectors) from mmdet.models.builder to mmocr.models.builder. If your model config refers to any model or layer of MMDetection or MMCV, add the mmdet. or mmcv. prefix to its name so that the model builder knows which namespace to work on.

Interested users may check out MMCV's tutorial on Registry for in-depth explanations on its mechanism.
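The prefix rule can be illustrated with a config fragment in which only the backbone comes from MMDetection. The module choices and arguments are examples rather than a recommended setup, and `split_scope` is a hypothetical helper that merely shows how such a prefix can be read. It does not reproduce how MMCV resolves scopes internally.

```python
# Illustrative fragment: module choices and arguments are examples only.
model = dict(
    type='DBNet',
    backbone=dict(type='mmdet.ResNet', depth=18),  # borrowed from MMDetection
    neck=dict(type='FPNC'),                        # resolved in MMOCR's own registry
)


def split_scope(type_name):
    """Split 'mmdet.ResNet' into ('mmdet', 'ResNet').

    Hypothetical helper for illustration; unprefixed names get scope None.
    """
    scope, sep, name = type_name.rpartition('.')
    return (scope, name) if sep else (None, type_name)


for key in ('backbone', 'neck'):
    print(key, split_scope(model[key]['type']))
```

Names without a prefix are looked up in MMOCR's own registries, so only modules borrowed from MMDetection or MMCV need the extra qualifier.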

New Features

Automatically replace SyncBN with BN for inference #420, #453
Support batch inference for CRNN and SegOCR #407
Support exporting documentation in pdf or epub format #406
Support persistent_workers option in data loader #459

Bug Fixes

Remove deprecated key in kie_test_imgs.py #381
Fix dimension mismatch in batch testing/inference of DBNet #383
Fix the problem of dice loss which stays at 1 with an empty target given #408
Fix a wrong link in ocr.py (@naarkhoo) #417
Fix undesired assignment to "pretrained" in test.py #418
Fix a problem in polygon generation of DBNet #421, #443
Skip invalid annotations in totaltext_converter #438
Add zero division handler in poly utils, remove Polygon3 #448

Improvements

Replace lanms-proper with lanms-neo to support installation on Windows (with special thanks to @gen-ko who has re-distributed this package!)
Support MIM #394
Add tests for PyTorch 1.9 in CI #401
Enables fullscreen layout in readthedocs #413
General documentation enhancement #395
Update version checker #427
Add copyright info #439
Update citation information #440

Contributors

We thank @2793145003, @samayala22, @manjrekarom, @naarkhoo, @gen-ko, @duanjiaqi, @gaotongxiao, @cuhk-hbsun, @innerlee, @wdsd641417025 for their contribution to this release!

- Python
Published by gaotongxiao almost 5 years ago

mmocr - MMOCR Release v0.2.1

Highlights

1. Upgrade to use MMCV-full >= 1.3.8 and MMDetection >= 2.13.0 for latest features
2. Add ONNX and TensorRT export tool, supporting the deployment of DBNet, PSENet, PANet and CRNN (experimental) #278, #291, #300, #328
3. Unified parameter initialization method which uses init_cfg in config files #365

New Features

Support TextOCR dataset #293
Support Total-Text dataset #266, #273, #357
Support grouping text detection box into lines #290, #304
Add benchmark_processing script that benchmarks data loading process #261
Add SynthText preprocessor for text recognition models #351, #361
Support batch inference during testing #310
Add user-friendly OCR inference script #366

Bug Fixes

Fix improper class ignorance in SDMGR Loss #221
Fix potential numerical zero division error in DRRG #224
Fix installing requirements with pip and mim #242
Fix dynamic input error of DBNet #269
Fix space parsing error in LineStrParser #285
Fix textsnake decode error #264
Correct isort setup #288
Fix a bug in SDMGR config #316
Fix kie_test_img for KIE nonvisual #319
Fix metafiles #342
Fix different device problem in FCENet #334
Ignore improper trailing empty characters in annotation files #358
Docs fixes #247, #255, #265, #267, #268, #270, #276, #287, #330, #355, #367
Fix NRTR config #356, #370

Improvements

Add backend for resizeocr #244
Skip image processing pipelines in SDMGR novisual #260
Speedup DBNet #263
Update mmcv installation method in workflow #323
Add part of Chinese documentations #353, #362
Add support for ConcatDataset with two workflows #348
Add list_from_file and list_to_file utils #226
Speed up sort_vertex #239
Support distributed evaluation of KIE #234
Add pretrained FCENet on IC15 #258
Support CPU for OCR demo #227
Avoid extra image pre-processing steps #375

- Python
Published by gaotongxiao almost 5 years ago

mmocr - MMOCR Release v0.2.0

Highlights

Add the NER approach Bert-softmax (NAACL'2019)
Add the text detection method DRRG (CVPR'2020)
Add the text detection method FCENet (CVPR'2021)
Improve ease of use by adding an end-to-end text detection and recognition demo and an online Colab demo.
Simplify the installation.

New Features

Add Bert-softmax for Ner task #148
Add DRRG #189
Add FCENet #133
Add end-to-end demo #105
Support batch inference #86 #87 #178
Add TPS preprocessor for text recognition #117 #135
Add demo documentation #151 #166 #168 #170 #171
Add checkpoint for Chinese recognition #156
Add metafile #175 #176 #177 #182 #183
Add support for numpy array inference #74

Bug Fixes

Fix the duplicated point bug due to transform for textsnake #130
Fix CTC loss NaN #159
Fix error raised if result is empty in demo #144
Fix results missing if one image has a large number of boxes #98
Fix package missing in dockerfile #109

Improvements

Simplify installation procedure via removing compiling #188
Speed up panet post processing so that it can detect dense texts #188
Add zh-CN README #70 #95
Support windows #89
Add Colab #147 #199
Add 1-step installation using conda environment #193 #194 #195

- Python
Published by cuhk-hbsun about 5 years ago

mmocr - MMOCR Release v0.1.0

Main Features

Support text detection, text recognition and the corresponding downstream tasks such as key information extraction.
For text detection, support both single-step (PSENet, PANet, DBNet, TextSnake) and two-step (MaskRCNN) methods.
For text recognition, support CTC-loss based method CRNN; Encoder-decoder (with attention) based methods SAR, Robustscanner; Segmentation based method SegOCR; Transformer based method NRTR.
For key information extraction, support GCN based method SDMG-R.
Provide checkpoints and log files for all of the methods above.

- Python
Published by jeffreykuang about 5 years ago