ocr-fileformat
Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
https://github.com/bertsky/ocrd_detectron2
OCR-D wrapper for detectron2 based segmentation models
gt-mufilevelrules
OCR-D-Level-Rules can be created automatically with gt-MufiLevelRules from the encodings published by MUFI (the Medieval Unicode Font Initiative).
https://github.com/bertsky/ocrd_wrap
OCR-D wrapper for arbitrary coords-preserving image operations
https://github.com/bertsky/nmalign
forced alignment of lists of strings by fuzzy string matching
https://github.com/bertsky/ocrd_doxa
OCR-D wrapper for DoxaPy image binarization via locally adaptive thresholding
https://github.com/bertsky/workflow-configuration
a Makefile-based setup for OCR-D workflows, with configuration examples
https://github.com/bertsky/docstruct
Document structure detection from PAGE-XML to METS-XML
https://github.com/bertsky/ocrd_publaynet
convert PubLayNet data into METS/PAGE-XML
gt_structure_1_2
The repo gt_structure_1_2 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
https://github.com/bertsky/ocrd_origami
OCR-D wrapper for poke1024/origami OLR+OCR
gt-repo-scripts
XSLT and shell scripts for analyzing and creating GitHub pages of a ground truth repository. These are centrally managed and can be used by all repositories created with gt-repo-template (https://github.com/OCR-D/gt-repo-template).
gt_structure_1_4
The repo gt_structure_1_4 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
https://github.com/bertsky/ocrd_jdeskew
OCR-D wrapper for Document Image Skew Estimation using Adaptive Radial Projection
gt_structure_1_1
The repo gt_structure_1_1 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
gt_structure_1_3
The repo gt_structure_1_3 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.