taulu
Taulu is a Python package designed to segment tabular data in scanned or photographed documents.
handprint
Apply different text recognition services to images of handwritten documents.
cremma-medieval
Transcription corpora for training HTR models for medieval manuscripts from the 12th to the 15th century.
lectaurep-bronod
Lectaurep-Bronod, ground truth for Maître Bronod's documents (French, 18th century)
https://github.com/albertocuadra/helmholtz_htr
Routine to compute the dissipation, dissipation rate, and the solenoidal and compressive parts of a three-dimensional velocity field of a DNS obtained using the Hypersonic Task-based Research solver
eutyches
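"Éditer les manuscrits grammaticaux glosés : solutions numériques face aux défis paléographiques. Le cas de la tradition manuscrite glosée d’Eutychès grammaticus" ("Editing glossed grammatical manuscripts: digital solutions to palaeographic challenges. The case of the glossed manuscript tradition of Eutyches grammaticus"). Please cite if using any of the models or data.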
timeuscorpus
Ground truth datasets for 18th- and 19th-century French HTR, produced by the ANR project TIME US
https://github.com/caltechlibrary/htr-test-cases
Images of documents for testing HTR.
cataloguessegmentationocr
Dataset and models for layout analysis and HTR of catalogs
documentarist
Process Caltech Archives' digital documents and photos, and annotate each page or image with information about its contents
https://github.com/alix-tz/aspyre-gt
A pipeline to transfer ground truth from Transkribus to eScriptorium.
escriptorium-documentation
Source code for the eScriptorium documentation website (built with MkDocs)
metagrapho-tropy
Add transcriptions to items in Tropy using the Transkribus metagrapho API
tapuscorpus
Ground truth for 20th-century French typewritten documents collected from Gallica and Europeana
tnah-2021-projet-notre-dame
Transcription of the 1860 logbooks of the restoration works at Notre-Dame de Paris, produced with eScriptorium
lectaurep-mariages-et-divorces
Lectaurep-Mariages-et-Divorces, ground truth for the Registres des Contrats de Mariages et des Séparations et Divorces (French, 19th century)
https://github.com/bluegreen-labs/weahtr_model
HTR/OCR for climate data model development
cremma-wikipedia
A collection of ground truth for training HTR models on contemporary French handwriting