https://github.com/alix-tz/aspyre-gt
A pipeline to transfer ground truth from Transkribus to eScriptorium.
Statistics
- Stars: 7
- Watchers: 2
- Forks: 0
- Open Issues: 9
- Releases: 3
README.md
ASPYRE GT
A converter that helps make your data compatible for import into eScriptorium.

SUMMARY
How to use Aspyre
As a library
Aspyre is a library. To install it, download the aspyrelib/ directory and install the dependencies. Then import it in your program with from aspyrelib import aspyre.
Parsing parameters with aspyre.AspyreArgs()
Start your project by parsing all the required information with an AspyreArgs() object.
python
Process essential information to run Aspyre
:param scenario: keyword describing the scenario (string)
:param source: path to source file (string)
[opt] :param destination: path to output (string)
[opt] :param talkative: activate a few print commands (bool)
[opt] :param vpadding: value to add to VPOS attr. in String nodes (int)
- Supported values for scenario: "tkb", "pdfalto", "limb"
- vpadding is only used in the PDFALTO and LIMB scenarios
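The parameter rules above can be illustrated with a small stand-in. This is only a sketch: ArgsSketch is a hypothetical class that is not part of aspyrelib, and the default values (None, False, 0) as well as the choice to reset vpadding outside the PDFALTO and LIMB scenarios are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Values taken from the documented docstring.
SUPPORTED_SCENARIOS = ("tkb", "pdfalto", "limb")
VPADDING_SCENARIOS = ("pdfalto", "limb")


@dataclass
class ArgsSketch:
    """Hypothetical stand-in mirroring the documented AspyreArgs parameters."""
    scenario: str
    source: str
    destination: Optional[str] = None  # assumed default
    talkative: bool = False            # assumed default
    vpadding: int = 0                  # assumed default

    def __post_init__(self):
        if self.scenario not in SUPPORTED_SCENARIOS:
            raise ValueError(f"unsupported scenario: {self.scenario!r}")
        # vpadding is only meaningful for PDFALTO and LIMB (assumption: reset it otherwise).
        if self.scenario not in VPADDING_SCENARIOS:
            self.vpadding = 0
```

With the real library, the equivalent step would presumably be a call to aspyre.AspyreArgs() with the same keywords; check its docstring for the exact signature.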
Transkribus to eScriptorium scenario with aspyre.TkbToEs()
:warning: This is really not the best way to transfer data between these two applications.
Runs the Transkribus to eScriptorium scenario (mainly resolves the schema declaration and the source image information).
python
Handle a Transkribus to eScriptorium transformation scenario
:param args: essential information to run transformation scenario (AspyreArgs)
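To give an idea of what resolving the schema declaration and the source image information can involve, here is a minimal sketch using only the standard library. It is not Aspyre's implementation: the function name set_schema_and_image is hypothetical, and the choice of the ALTO v4 namespace with the alto-4-0.xsd schema location is an assumption about what the target expects.

```python
import xml.etree.ElementTree as ET

ALTO_NS = "http://www.loc.gov/standards/alto/ns-v4#"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
# Assumed target schema version.
SCHEMA_LOCATION = ALTO_NS + " http://www.loc.gov/standards/alto/v4/alto-4-0.xsd"


def set_schema_and_image(alto_xml: str, image_name: str) -> str:
    """Return the ALTO document moved to the v4 namespace and pointing at image_name."""
    root = ET.fromstring(alto_xml)
    # Move every element into the ALTO v4 namespace, whatever it was before.
    for elem in root.iter():
        local = elem.tag.split("}", 1)[-1]
        elem.tag = f"{{{ALTO_NS}}}{local}"
    root.set(f"{{{XSI_NS}}}schemaLocation", SCHEMA_LOCATION)

    # Find or create Description/sourceImageInformation/fileName.
    desc = root.find(f"{{{ALTO_NS}}}Description")
    if desc is None:
        desc = ET.Element(f"{{{ALTO_NS}}}Description")
        root.insert(0, desc)  # Description comes before Layout
    parent = desc
    for name in ("sourceImageInformation", "fileName"):
        child = parent.find(f"{{{ALTO_NS}}}{name}")
        if child is None:
            # Appended without checking the element order required by the schema.
            child = ET.SubElement(parent, f"{{{ALTO_NS}}}{name}")
        parent = child
    parent.text = image_name

    ET.register_namespace("", ALTO_NS)
    ET.register_namespace("xsi", XSI_NS)
    return ET.tostring(root, encoding="unicode")
```

The file name written into fileName should match the image shipped in the same archive, since that is how the transcription and the image are paired on import.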
PDFALTO to eScriptorium scenario with aspyre.PdfaltoToEs()
Runs the PDFALTO to eScriptorium scenario (mainly resolves the schema declaration, the source image information and the homothety, i.e. the rescaling of coordinates).
python
Handle a PDFALTO to eScriptorium transformation scenario
:param args: essential information to run transformation scenario (AspyreArgs)
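The homothety mentioned above amounts to multiplying every coordinate by the same ratio so that the boxes line up with the image. The sketch below shows the arithmetic on one bounding box. The function rescale_box is hypothetical, the ratio is assumed to be image width divided by page width, and applying vpadding after scaling is also an assumption.

```python
def rescale_box(box, ratio, vpadding=0):
    """Scale an ALTO bounding box by `ratio`, then shift VPOS by `vpadding`."""
    scaled = {key: float(box[key]) * ratio
              for key in ("HPOS", "VPOS", "WIDTH", "HEIGHT")}
    # Assumption: the vertical padding is added once the box has been scaled.
    scaled["VPOS"] += vpadding
    return scaled


# Example: the page is declared 595 units wide, the image is 1190 pixels wide.
ratio = 1190 / 595.0
box = {"HPOS": "10.5", "VPOS": "20", "WIDTH": "100", "HEIGHT": "12"}
scaled_box = rescale_box(box, ratio, vpadding=3)
```

Attribute values are read as strings because that is how they appear in the XML; they would need to be converted back to strings before being written to the file.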
As a CLI
A legacy script (run.py) from an earlier stage of the project lets you use Aspyre as a CLI fairly easily.
Step by step (Transkribus scenario)
- Export the transcriptions and the images from Transkribus; you now have a zip file (since Aspyre 0.2.4, there is no need to unzip it)
- Create a virtual environment based on Python 3 and install dependencies (cf. requirements.txt)
- Run aspyre/run.py (python3 aspyre/run.py) with the appropriate options
- See the CLI's options with --help (python3 aspyre/run.py --help)
- Aspyre will create a new ZIP that can be loaded onto eScriptorium
Example
python
$ virtualenv venv -p python3
$ source venv/bin/activate
(venv)$ pip install -r requirements.txt
(venv)$ python3 aspyre/run.py -i /path/to/exported/documents
As a service online
This is no longer an option, following Heroku's decision in 2022 to stop offering free hosting.
~~You can now access Aspyre as a service online (GUI)! :arrow_right: go to Aspyre GUI~~
~~Step by step (Transkribus scenario)~~
- ~~Export the transcriptions and the images from Transkribus; you now have a zip file~~
- ~~If your archive weighs more than 500 MB, remove the images from the zip file (unzip the archive and rezip it keeping only the alto/ directory and the 'mets.xml' file)~~
- ~~Load the zip file onto the application and download the returned zip file~~
- ~~You can now directly load this new ZIP onto eScriptorium~~
Configuring the export from Transkribus
Export your data checking the “Transkribus Document” format option and checking the “Export ALTO” and “Export Image” sub-options.
Which input from PDFALTO?
Minimum content:
python
folder(.zip)/
- out/
  - identifier.xml_data/
    - image-1.png
  - identifier.xml
For now, tar.gz archives are not supported; only zip archives are.
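Before running the PDFALTO scenario, it can be handy to check that an archive follows the layout above. The following is a minimal sketch with the standard zipfile module; find_paired_xml is a hypothetical helper, not something provided by Aspyre, and it only checks that each XML file has a sibling directory named after it with the _data suffix.

```python
import zipfile


def find_paired_xml(archive):
    """Return the XML files in the zip that have a matching <name>.xml_data/ directory."""
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
    xml_files = [name for name in names if name.endswith(".xml")]
    # An XML file is "paired" if at least one entry lives under <name>.xml_data/.
    return [xml for xml in xml_files
            if any(name.startswith(xml + "_data/") for name in names)]
```

An empty result suggests that the archive is missing either the XML output or the extracted images. The function accepts a path or a file-like object, as zipfile.ZipFile does.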
Reporting Errors
If you notice unexpected errors or bugs or if you wish to add more complexity to the way Aspyre transforms the ALTO XML files, please create an issue and contribute!
Owner
- Name: Alix Chagué
- Login: alix-tz
- Kind: user
- Company: Inria
- Website: http://alix-tz.github.io
- Twitter: Alix_Tz
- Repositories: 10
- Profile: https://github.com/alix-tz
PhD student in Digital Humanities @ Université de Montréal and Inria.
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 18
- Total pull requests: 12
- Average time to close issues: about 1 month
- Average time to close pull requests: 24 days
- Total issue authors: 3
- Total pull request authors: 2
- Average comments per issue: 1.39
- Average comments per pull request: 0.17
- Merged pull requests: 9
- Bot issues: 0
- Bot pull requests: 6
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- alix-tz (16)
- JMCarrow (1)
- aethralis (1)
Pull Request Authors
- alix-tz (6)
- dependabot[bot] (6)
Dependencies
- Pillow >=6.2.2
- beautifulsoup4 ==4.9.1
- bs4 ==0.0.1
- lxml ==4.9.1
- pylint ==2.4.4
- pylint-fail-under ==0.3.0
- soupsieve ==1.9.6
- termcolor ==1.1.0
- tqdm ==4.48.2
