2024--medieval-czech

Ground truth transcriptions from the 1488 Prague Bible, manually transcribed during the 2024 HTR Winter School, based on Old Czech texts.

https://github.com/htr-school-vienna/2024--medieval-czech

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.3%) to scientific vocabulary
Last synced: 6 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: HTR-School-Vienna
  • License: cc-by-sa-4.0
  • Language: CSS
  • Default Branch: main
  • Homepage:
  • Size: 995 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 1 year ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

README.md

HTR Winter School 2024 - Medieval Czech - Prague Bible (1488)

Short description of the record: Ground truth from print Ink 13.C.5, manually transcribed.

ATTENTION: To clone this repo you need to have Git LFS installed and then clone the repository like this:

git lfs clone git@github.com:htr-school-vienna/[your_repository_name].git

Description

The Prague Bible (1488, Vienna, Österreichische Nationalbibliothek, shelfmark Ink 13.C.5, available from: http://data.onb.ac.at/rec/AC07537625, Old Czech)

Print: Old Czech, Bastarda, end of the 15th C.

Origin of the data

Source of images: ÖNB (https://digital.onb.ac.at/RepViewer/viewer.faces?doc=DTL_547662&order=1&view=SINGLE).

The transcription rules were based on semi-diplomatic transcription rules set by Pero OCR and Směrnice pro vydávání starších českých textů by Jiří Daňhelka (https://vokabular.ujc.cas.cz/moduly/edicnipoznamka.aspx?id=DanhelkaSmernice).

Data organisation

A selection of semi-diplomatically transcribed texts from the so-called Prague Bible (1488). The texts were transcribed by the participants of the HTR Winter School 2024 in Vienna:

  • Martin Plechatý: 7v–9v, Genesis, Gn 2:15–8:17
  • Daniel Katscher: 103r–105r, Joshua, Jos 7:13–11:3
  • Marie Hedvíková: 246r–247v, Judith, Jdt 4:14–7:28
  • Jan Škvrňák: 279r–281r, Psalms, Ps 31:10–39:6
  • Martina Spěváčková: 416r–417v, Ezechiel, Ez 22:22–25:2
  • Jan Švarc: 517v–519v, Gospel of Luke, L 11:51–16:4
  • Tereza Hejdová: 570r–571v, Hebrews, Hb 9:8–13:4
  • Václav Steiner: 598v–599r, Apocalypsis, Ap 10:1–16:16;

and corrected by Anna Michalcová.

Biblical abbreviations follow the usage of the Old Czech dictionary (https://sources.cms.flu.cas.cz/src/index.php?s=v&bookid=1226&page=119). The transcribed folio ranges are listed above.
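The list above gives folio ranges rather than explicit page counts. As a minimal sketch, the number of pages per contributor can be derived from those ranges, assuming that the foliation is continuous (each folio has a recto and a verso) and that every side inside a range was transcribed. Neither assumption is stated in the README, and the helper names `folio_to_page_index` and `pages_in_range` are hypothetical.

```python
import re

# Folio ranges copied from the contributor list above: (first side, last side).
FOLIO_RANGES = {
    "Martin Plechatý": ("7v", "9v"),
    "Daniel Katscher": ("103r", "105r"),
    "Marie Hedvíková": ("246r", "247v"),
    "Jan Škvrňák": ("279r", "281r"),
    "Martina Spěváčková": ("416r", "417v"),
    "Jan Švarc": ("517v", "519v"),
    "Tereza Hejdová": ("570r", "571v"),
    "Václav Steiner": ("598v", "599r"),
}


def folio_to_page_index(folio: str) -> int:
    """Map a folio side to a running index: 1r -> 0, 1v -> 1, 2r -> 2, ..."""
    match = re.fullmatch(r"(\d+)([rv])", folio)
    if match is None:
        raise ValueError(f"not a folio reference: {folio!r}")
    number, side = int(match.group(1)), match.group(2)
    return 2 * (number - 1) + (1 if side == "v" else 0)


def pages_in_range(first: str, last: str) -> int:
    """Count the sides from `first` to `last`, both inclusive."""
    return folio_to_page_index(last) - folio_to_page_index(first) + 1


# Assumption: every side inside a range was transcribed, with no gaps.
page_counts = {name: pages_in_range(*rng) for name, rng in FOLIO_RANGES.items()}
total_pages = sum(page_counts.values())

for name, count in page_counts.items():
    print(f"{name}: {count} pages")
print(f"Total: {total_pages} pages")
```

Under these assumptions the ranges add up to 34 pages. The count should be checked against the files in the repository before it is reported anywhere.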

How to cite

This dataset was created by Marie Hedvíková, Tereza Hejdová, Daniel Katscher, Anna Michalcová, Martin Plechatý, Martina Spěváčková, Václav Steiner, Jan Škvrňák, and Jan Švarc. The digitisation is not copyright-free, but the transcription is. However, properly annotating a corpus takes time, and that work should be recognised. If you use any item from this corpus as ground truth, cite the dataset using the following information: <!-- Copy citation BibTeX from Zenodo 10.5281/zenodo.10589561 -->

Copyright and licence

This dataset was created as part of the Winter School of Handwritten Text Recognition of Medieval Manuscripts 2024, Vienna, at the Österreichische Akademie der Wissenschaften, Institut für Mittelalterforschung. All transcriptions are licensed under the Creative Commons Attribution-ShareAlike 4.0 licence (CC BY-SA 4.0). Images were provided by the Austrian National Library (ÖNB) and are licensed under a Creative Commons 4 licence.

Owner

  • Name: HTR School Vienna
  • Login: HTR-School-Vienna
  • Kind: organization

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: >-
  HTR Winter School 2024 - Medieval Czech - Prague Bible
  (1488)
message: >-
  If you use this dataset, please cite it using the metadata
  from this file.
type: dataset
authors:
  - family-names: Plechatý
    given-names: Martin
    email: martin-plechaty@seznam.cz
    affiliation: University of West Bohemia
    orcid: 'https://orcid.org/0009-0000-3305-2075'
  - given-names: Václav
    family-names: Steiner
    email: vstein07@students.zcu.cz
    affiliation: University of West Bohemia
    orcid: 'https://orcid.org/0009-0004-8336-9846'
  - given-names: Jan
    family-names: Švarc
    email: jansvarc2000@gmail.com
    affiliation: University of West Bohemia
    orcid: 'https://orcid.org/0009-0005-1274-0545'
  - given-names: Marie
    family-names: Hedvíková
    email: marie.hedvikova@gmail.com
    affiliation: Charles University
    orcid: 'https://orcid.org/0009-0008-3693-6288'
  - given-names: Anna
    family-names: Michalcová
    orcid: 'https://orcid.org/0000-0003-4760-6950'
    email: a.michalcova@ujc.cas.cz
    affiliation: Czech Language Institute
  - family-names: Hejdová
    given-names: Tereza
    email: hejdovat@gmail.com
    affiliation: Czech Language Institute
    orcid: 'https://orcid.org/0000-0003-3998-8880'
  - given-names: Martina
    family-names: Spěváčková
    email: spevacko@gapps.zcu.cz
    affiliation: University of West Bohemia
    orcid: 'https://orcid.org/0000-0002-9357-4614'
  - given-names: Daniel
    family-names: Katscher
    email: daniel.katscher.wien@gmail.com
    affiliation: University of Vienna
    orcid: 'https://orcid.org/0009-0008-3475-2522'
  - given-names: Jan
    family-names: Škvrňák
    email: jan.skvrnak@gmail.com
    affiliation: Charles University
    orcid: 'https://orcid.org/0000-0003-0985-4144'
abstract: >
  The Prague Bible (1488, Vienna, Österreichische
  Nationalbibliothek, shelfmark Ink 13.C.5, available from:
  http://data.onb.ac.at/rec/AC07537625, Old Czech)

  Print: Old Czech, Bastarda, end of the 15th C.
keywords:
  - Medieval Czech
  - Prague Bible (1488)
license: CC-BY-SA-4.0
date-released: '2024-12-19'
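
The README asks users to cite the dataset but does not include a ready-made entry. The sketch below shows one way to render a BibTeX entry from the fields of the CITATION.cff file above. The metadata is copied by hand into a Python dictionary, the function `to_bibtex` and the citation key `prague_bible_1488` are hypothetical, and no DOI is included because the file does not state one. The Zenodo record mentioned in the README remains the authoritative source for the citation.

```python
# Metadata copied by hand from the CITATION.cff file above.
# Authors are (family name, given name) pairs in the order of the file.
CFF = {
    "title": "HTR Winter School 2024 - Medieval Czech - Prague Bible (1488)",
    "authors": [
        ("Plechatý", "Martin"),
        ("Steiner", "Václav"),
        ("Švarc", "Jan"),
        ("Hedvíková", "Marie"),
        ("Michalcová", "Anna"),
        ("Hejdová", "Tereza"),
        ("Spěváčková", "Martina"),
        ("Katscher", "Daniel"),
        ("Škvrňák", "Jan"),
    ],
    "date_released": "2024-12-19",
    "license": "CC-BY-SA-4.0",
}


def to_bibtex(meta: dict, key: str) -> str:
    """Render a minimal @misc entry; `key` is a citation key chosen by the user."""
    authors = " and ".join(f"{family}, {given}" for family, given in meta["authors"])
    year = meta["date_released"][:4]  # ISO date, so the year is the first four characters
    lines = [
        f"@misc{{{key},",
        f"  author = {{{authors}}},",
        f"  title = {{{meta['title']}}},",
        f"  year = {{{year}}},",
        f"  note = {{Dataset, licence {meta['license']}}},",
        "}",
    ]
    return "\n".join(lines)


# The key "prague_bible_1488" is a made-up example, not an official identifier.
bibtex_entry = to_bibtex(CFF, "prague_bible_1488")
print(bibtex_entry)
```

The `@misc` entry type is used because it is accepted by classic BibTeX styles. With biblatex, `@dataset` would be the closer match.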

GitHub Events

Total
  • Member event: 1
  • Push event: 10
  • Create event: 2
Last Year
  • Member event: 1
  • Push event: 10
  • Create event: 2