https://github.com/caltechlibrary/htr-test-cases
Images of documents for testing HTR.
Statistics
- Stars: 1
- Watchers: 4
- Forks: 0
- Open Issues: 0
- Releases: 0
README.md
Test cases for HTR experiments
This repository contains test images for the Library's studies on handwritten text recognition.
Table of contents
- Introduction
- Installation
- Usage
- Known issues and limitations
- Getting help
- Contributing
- License
- Authors and history
- Acknowledgments
Introduction
The Caltech Library is working on applications of optical character recognition (OCR) and handwritten text recognition (HTR) to documents stored in the Caltech Archives. Developing software such as Handprint requires test cases in the form of document images. This repository holds a collection of such images for the Library's work.
The images are stored in subdirectories that give some indication of their origin and nature; for example, the caltech subdirectory contains images from the Caltech Archives. The source of each image is described in an associated XML file containing Dublin Core metadata in OAI 2.0 DC format (based on the specification document dated 2015-01-08). There is a separate .xml file for each image file. An XML schema is available elsewhere for the format used to store the Dublin Core metadata.
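As a rough illustration of reading one of these metadata files, the following Python sketch collects the Dublin Core fields from an OAI DC record. The sample record, its field values, and the function name `read_dublin_core` are hypothetical and are not taken from this repository; the actual files may use a different set of elements.

```python
import xml.etree.ElementTree as ET

# Standard namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical record for illustration only; not copied from the repository.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Hypothetical letter</dc:title>
  <dc:creator>Unknown</dc:creator>
  <dc:source>https://example.org/item/1</dc:source>
</oai_dc:dc>
"""


def read_dublin_core(xml_text):
    """Return a dict mapping Dublin Core element names to lists of values."""
    root = ET.fromstring(xml_text)
    fields = {}
    for elem in root.iter():
        # Keep only elements in the Dublin Core namespace (skips the wrapper).
        if elem.tag.startswith("{" + DC_NS + "}"):
            name = elem.tag.split("}", 1)[1]
            fields.setdefault(name, []).append((elem.text or "").strip())
    return fields


record = read_dublin_core(SAMPLE)
for name, values in record.items():
    print(name, values)
```

Values are stored as lists because Dublin Core allows any element to appear more than once in a record.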
Installation
There is no software in this repository; it contains only image files, XML files, and text files. You can download the entire set using various methods. One method is to use GitHub's "Download ZIP" link,
https://github.com/caltechlibrary/htr-test-cases/archive/master.zip
in combination with your preferred download tool (your browser, curl, wget, or similar). A second method is to use git to clone the repository to your local computer:
    git clone https://github.com/caltechlibrary/htr-test-cases.git
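If you would rather script the ZIP download, here is a minimal Python sketch using only the standard library. The function names `fetch_bytes` and `extract_zip` and the destination directory name are illustrative assumptions; the URL is the "Download ZIP" link given above.

```python
import io
import urllib.request
import zipfile
from pathlib import Path

ZIP_URL = "https://github.com/caltechlibrary/htr-test-cases/archive/master.zip"


def fetch_bytes(url):
    """Download a URL and return its contents as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def extract_zip(data, dest_dir):
    """Unpack ZIP data (bytes) into dest_dir and return the entry names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Example call (needs network access; the directory name is an assumption):
# names = extract_zip(fetch_bytes(ZIP_URL), "htr-test-cases-download")
```

This reads the whole archive into memory before unpacking it, which is simple but may be unsuitable for a large collection; in that case curl, wget, or git is likely the better choice.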
Usage
This is a collection of files. You can use them in whatever way you would use other image files.
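For example, a test harness might walk a local copy of the collection and match each image with its metadata file. The Python sketch below assumes that each .xml file shares its base name with the image it describes and that images use one of the listed extensions; neither assumption is confirmed by this README, so adjust both to match the files you find. The function name `pair_images_with_metadata` is hypothetical.

```python
from pathlib import Path

# Assumed set of image extensions; the collection may use others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".gif"}


def pair_images_with_metadata(root_dir):
    """Return (image_path, xml_path_or_None) tuples for images under root_dir."""
    pairs = []
    for path in sorted(Path(root_dir).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            # Assumption: the metadata file has the same base name as the image.
            sidecar = path.with_suffix(".xml")
            pairs.append((path, sidecar if sidecar.exists() else None))
    return pairs
```

Calling `pair_images_with_metadata("htr-test-cases")` on a cloned copy would then yield pairs that can be passed to an HTR tool together with their source metadata.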
Known issues and limitations
None at this time.
Getting help
If you find an issue, please submit it in the GitHub issue tracker for this repository.
Contributing
We would be happy to receive your help and participation with enhancing this collection of test images. Please visit the guidelines for contributing for some tips on getting started.
License
Please see the individual image files and subdirectories for applicable copyright and license information.
Authors and history
Mike Hucka started this collection in 2019, with the help of others at the Caltech Library's DLD group, including Tommy Keswick and Peter Collopy.
Acknowledgments
The vector artwork used as the logo for Handprint was created by Alice Design from the Noun Project and is licensed under the Creative Commons CC-BY 3.0 license. Mike Hucka slightly modified the original icon file, changing its color and reformatting it for use as this repository's icon.
This work was funded by the California Institute of Technology Library.
Owner
- Name: Caltech Library
- Login: caltechlibrary
- Kind: organization
- Email: helpdesk@library.caltech.edu
- Location: Pasadena, CA 91125
- Website: https://www.library.caltech.edu/
- Repositories: 84
- Profile: https://github.com/caltechlibrary
We manage the physical and digital holdings of the California Institute of Technology, provide services and training, and develop open-source software.
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0