wurzburg-glosses-extraction

Extracts lemmata, forms and loci from the Würzburg glosses

https://github.com/centrefordigitalhumanities/wurzburg-glosses-extraction

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.5%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: CentreForDigitalHumanities
  • License: bsd-3-clause
  • Language: Python
  • Default Branch: master
  • Size: 73.2 KB
Statistics
  • Stars: 0
  • Watchers: 4
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created over 10 years ago · Last pushed about 2 years ago
Metadata Files

README.md

Würzburg glosses extraction

Extracts grammatical information on glosses from the Würzburg glosses lexicon (Kavanagh 2001).

Usage

Starting from the PDF version of the lexicon, first convert it to HTML with pdf2html.py (which uses PDFMiner), then remove the tags with cleanhtml.py (which uses BeautifulSoup).
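To make the tag-removal step concrete, here is a minimal sketch of stripping HTML tags down to plain text. It is not the code in cleanhtml.py: it swaps in the standard library's html.parser for BeautifulSoup so that it runs without extra dependencies, and the names TextExtractor and strip_tags are hypothetical. The choice of which tags start a new line is also an assumption.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content and drops all markup (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip = 0  # depth inside <script>/<style>, whose content is ignored

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag in ("br", "div", "p"):
            # Assumption: these tags mark line breaks in the converted HTML.
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def strip_tags(html):
    """Return the text of an HTML string, one non-empty line per block."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Collapse runs of whitespace within each line and drop empty lines.
    lines = [" ".join(line.split()) for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    print(strip_tags("<div><span>W&uuml;rzburg</span> <b>glosses</b></div>"))
```

The output of such a step is plain text, which is easier to parse in the extraction phase than HTML that still carries layout markup.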

After preprocessing, use run.py to extract the grammatical information from the lexicon.
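As an illustration of what extracting loci from cleaned text can involve, the sketch below finds locus references with a regular expression. It is not the logic in run.py. It assumes that loci are written as a folio number, a column letter from a to d, and a gloss number (for example 5b28); the pattern, the function name find_loci, and the sample string are all hypothetical.

```python
import re

# Assumed locus shape: folio number, column letter (a-d), gloss number.
LOCUS_RE = re.compile(r"\b(\d{1,2})([a-d])(\d{1,2})\b")


def find_loci(text):
    """Return every locus found in text as a dict of its three parts."""
    return [
        {"folio": int(folio), "column": column, "gloss": int(gloss)}
        for folio, column, gloss in LOCUS_RE.findall(text)
    ]


if __name__ == "__main__":
    # Invented sample text, not taken from the lexicon.
    for locus in find_loci("form-one 5b28, 12a5; form-two 33d10"):
        print(locus)
```

A pattern this simple will miss any locus that deviates from the assumed shape, which is the kind of case the single-gloss web application is meant to handle by hand.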

Web application

This project comes with a small web application (built with Flask) that lets you run the extraction for a single gloss, in case something went wrong during the automatic phase. Start the web application by running web.py.

Licence

This work is shared under a BSD 3-Clause licence. See LICENSE for more information.

Citation

To cite this repository, please use the metadata provided in CITATION.cff.

Contact

Würzburg glosses extraction is developed by Martijn van der Klis and the Research Software Lab at the Centre for Digital Humanities, Utrecht University.

For questions or suggestions, contact the Centre for Digital Humanities or open an issue in this repository.

References

Kavanagh, Seamus (2001). A lexicon of the Old Irish glosses in the Würzburg Manuscript of the Epistles of St. Paul. Edited by Dagmar S. Wodtko. Österreichische Akademie der Wissenschaften.

Owner

  • Name: Centre for Digital Humanities
  • Login: CentreForDigitalHumanities
  • Kind: organization
  • Email: cdh@uu.nl
  • Location: Netherlands

Interdisciplinary centre for research and education in computational and data-driven methods in the humanities.

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Würzburg glosses extraction
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Martijn
    name-particle: van der
    family-names: Klis
  - email: cdh@uu.nl
    name: >-
      Research Software Lab, Centre for Digital Humanities,
      Utrecht University
    city: Utrecht
    website: >-
      https://cdh.uu.nl/centre-for-digital-humanities/research-software-lab/
identifiers:
  - type: doi
    value: 10.5281/zenodo.11072623
repository-code: >-
  https://github.com/UUDigitalHumanitieslab/wurzburg-glosses-extraction
abstract: >-
  Extracts lemmata, forms and loci from the Würzburg glosses
  lexicon (Kavanagh 2001).
license: BSD-3-Clause
commit: 52df050
version: '1.0'
date-released: '2024-04-26'

GitHub Events

Total
  • Member event: 2
Last Year
  • Member event: 2

Issues and Pull Requests

All Time
  • Total issues: 0
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: less than a minute
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: less than a minute
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0