ocr-fileformat
Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
gt_structure_1_3
The repo gt_structure_1_3 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
cataloguessegmentationocr
Dataset and models for catalogs' Layout analysis and HTR
gt-repo-scripts
XSLT and shell scripts for analyzing and creating GitHub pages of a ground truth repository. These are centrally managed and can be used by all repositories created with gt-repo-template (https://github.com/OCR-D/gt-repo-template).
gt_structure_1_2
The repo gt_structure_1_2 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
gt_structure_1_4
About The repo gt_structure_1_4 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
german-brazilian-newspapers-dataset_2
The GBN Dataset consists German-Brazilian historical newspapers, along with their digital and binarized images and ground truth files.
gt_structure_1_1
The repo gt_structure_1_1 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
german-brazilian-newspapers-dataset_1
The GBN Dataset consists German-Brazilian historical newspapers, along with their digital and binarized images and ground truth files.