wurzburg-glosses-extraction
Extracts lemmata, forms and loci from the Würzburg glosses
https://github.com/centrefordigitalhumanities/wurzburg-glosses-extraction
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in README
- ✓ Academic publication links: Links to: zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (7.5%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: CentreForDigitalHumanities
- License: bsd-3-clause
- Language: Python
- Default Branch: master
- Size: 73.2 KB
Statistics
- Stars: 0
- Watchers: 4
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
Würzburg glosses extraction
Allows one to extract grammatical information on glosses from the Würzburg glosses lexicon (Kavanagh 2001).
Usage
Starting from the PDF version of the lexicon, use pdf2html.py to convert it to HTML (uses PDFMiner), then cleanhtml.py to remove the tags (uses BeautifulSoup).
After preprocessing, run.py extracts the grammatical information from the lexicon.
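The tag-removal step can be illustrated with a minimal sketch. It is not the code in cleanhtml.py: the repository uses BeautifulSoup, whereas this example substitutes the standard library's html.parser so it runs without extra dependencies. The names TextExtractor and strip_tags are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes and ignores all tags (stand-in for the BeautifulSoup step)."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for each run of text found between tags.
        self.chunks.append(data)


def strip_tags(html: str) -> str:
    """Return the text content of `html` with runs of whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flushes any text still buffered at the end of the input
    return " ".join("".join(parser.chunks).split())
```

Calling strip_tags on the HTML produced by the conversion step would yield plain text for a later extraction pass. How the real scripts handle layout details such as fonts or line breaks is not shown here.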
Web application
This project comes with a small web application (built in Flask) that allows you to
run the extraction for a single gloss, in case something went wrong during the automatic phase. The web application
can be started by running web.py.
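A single-gloss endpoint of this kind might look roughly like the sketch below. It is an illustration under stated assumptions, not the contents of web.py: the route /extract, the form field entry, and the functions extract_single and create_app are all hypothetical, and the placeholder extraction does not reproduce the project's actual logic.

```python
def extract_single(entry_text: str) -> dict:
    """Placeholder for the real extraction, which lives in the repository.

    Assumption for illustration only: the first token is treated as the
    lemma and the remaining tokens are returned unprocessed.
    """
    tokens = entry_text.split()
    return {"lemma": tokens[0] if tokens else "", "rest": tokens[1:]}


def create_app():
    """Build a small Flask app exposing the placeholder extraction."""
    # Imported here so extract_single can be used without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/extract", methods=["POST"])
    def extract():
        # Read one lexicon entry from the submitted form and return JSON.
        entry_text = request.form.get("entry", "")
        return jsonify(extract_single(entry_text))

    return app
```

With Flask installed, create_app().run() would start a development server. Posting a form field named entry to /extract would then return the placeholder result as JSON.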
Licence
This work is shared under a BSD 3-Clause licence. See LICENSE for more information.
Citation
To cite this repository, please use the metadata provided in CITATION.cff.
Contact
Würzburg glosses extraction is developed by Martijn van der Klis and the Research Software Lab at the Centre for Digital Humanities, Utrecht University.
For questions or suggestions, contact the Centre for Digital Humanities or open an issue in this repository.
References
Kavanagh, Seamus (2001). A lexicon of the Old Irish glosses in the Würzburg Manuscript of the Epistles of St. Paul. Edited by Dagmar S. Wodtko. Österreichische Akademie der Wissenschaften.
Owner
- Name: Centre for Digital Humanities
- Login: CentreForDigitalHumanities
- Kind: organization
- Email: cdh@uu.nl
- Location: Netherlands
- Website: https://cdh.uu.nl/
- Repositories: 39
- Profile: https://github.com/CentreForDigitalHumanities
Interdisciplinary centre for research and education in computational and data-driven methods in the humanities.
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: Würzburg glosses extraction
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Martijn
    name-particle: van der
    family-names: Klis
  - email: cdh@uu.nl
    name: >-
      Research Software Lab, Centre for Digital Humanities,
      Utrecht University
    city: Utrecht
    website: >-
      https://cdh.uu.nl/centre-for-digital-humanities/research-software-lab/
identifiers:
  - type: doi
    value: 10.5281/zenodo.11072623
repository-code: >-
  https://github.com/UUDigitalHumanitieslab/wurzburg-glosses-extraction
abstract: >-
  Extracts lemmata, forms and loci from the Würzburg glosses
  lexicon (Kavanagh 2001).
license: BSD-3-Clause
commit: 52df050
version: '1.0'
date-released: '2024-04-26'
GitHub Events
Total
- Member event: 2
Last Year
- Member event: 2
Issues and Pull Requests
Last synced: about 1 year ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: less than a minute
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: less than a minute
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0