2023--late-medieval-latin
Ground truth for the ÖNB 3891 manuscript, plus pages read automatically with Transkribus.
https://github.com/htr-school-vienna/2023--late-medieval-latin
Science Score: 44.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.9%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: HTR-School-Vienna
- License: cc-by-sa-4.0
- Language: CSS
- Default Branch: main
- Homepage: https://htr-school-vienna.github.io/2023--late-medieval-latin/
- Size: 3.18 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
HTR Winter School 2023/2024 - Late Medieval Latin, ONB 3891
Ground truth for the ÖNB 3891 manuscript, plus pages read automatically with Transkribus.
ATTENTION: To clone this repository, you need to have Git LFS installed. Then clone the repository like this:
git lfs clone git@github.com:htr-school-vienna/[your_repository_name].git
Description
Sermones by Thomas Ebendorfer (1388-1464) as found in MS Vienna, Austrian National Library (ÖNB), Cod. 3891. Wolfgang Chranekker, an organist in St. Wolfgang, finished writing the manuscript in 1441. See the description of the manuscript at Manuscripta.at.
Writing: Latin, Bastarda, mid-15th century.
Number of files:
Number of lines:
Origin of the data:
Source of images: Austrian National Library
Description or citation of transcription guidelines
- expanded abbreviations
- preserved the original punctuation
- used "/" for virgula
- didn't add "." at the end of sentences
- used "¬" at the end of the line if a word is divided
- used "v" for the consonant and "u" for the vowel
- used "i" for i/j
- used "s" for ſ/s
- used c/t as in the manuscript
- no capitalization of letters
- preserved "ll" in place of "L", "ff" in place of "F", etc.
- separated prepositions from words
- wrote words together that ought to be written together
- preserved numbers
See Google docs
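Some of the guidelines above are purely mechanical and can be illustrated in code. The following is a minimal sketch, not a tool used by the transcribers: the function names `normalize_line` and `join_divided_words` are hypothetical, and only the rules that need no palaeographic judgement are covered (lower case, "s" for ſ, "i" for j, and rejoining words divided with "¬"). Rules such as the u/v distinction or the expansion of abbreviations require human reading and are deliberately left out.

```python
def normalize_line(text: str) -> str:
    """Apply the mechanical guideline rules to one transcribed line.

    Hypothetical helper: lower-cases the text, then replaces the long s
    with "s" and "j" with "i". It does not touch u/v or abbreviations.
    """
    lowered = text.lower()
    return lowered.replace("ſ", "s").replace("j", "i")


def join_divided_words(lines: list[str]) -> str:
    """Join lines into running text, rejoining words divided with "¬".

    Assumes "¬" appears only as the last character of a line whose
    final word continues on the next line.
    """
    parts = []
    for line in lines:
        stripped = line.strip()
        if stripped.endswith("¬"):
            # Drop the marker and add no space, so both halves fuse.
            parts.append(stripped[:-1])
        else:
            parts.append(stripped + " ")
    return "".join(parts).strip()


if __name__ == "__main__":
    sample = ["in principio erat ver¬", "bum / et verbum"]
    print(join_divided_words(sample))
    print(normalize_line("Jeſus dixit"))
```

A sketch like this could help when comparing the ground truth with text from other editions, which may follow different conventions for capitalization or i/j.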
Data organisation
- How are the data organised in the files (e.g. images in images folder, tei export in tei folder, etc.)?
- If there is a system for naming images and files (and there should be), this is the place to describe it.
- Anything else that might help you understand the structure of the repository
How to cite
This dataset was created by Cehuľová Viktória, Ciuntu Mara-Elena, Engelmaier Leonhard, Kohn Albert, Lukáč Kováčová Magdaléna, Lukáč Labancová Ivana, Mihaljević Ana, Odstrčilík Jan, Roček Martin, Rokpelne Liene, Scalia Andrea, Šaldová Zuzana, Vašíček Andrej, Yücel Fatih, Zelenková Adéla.
The digitisation is not copyright free, but the transcription is. However, properly annotating a corpus takes time and is a task that should be recognised. If you use any item from this corpus as ground truth, cite the dataset using the following information:
Copy citation BibTeX from Zenodo
Copyright and licence
This dataset was created as part of the Winter School of Handwritten Text Recognition of Medieval Manuscripts 2023/2024, Vienna, at the Österreichische Akademie der Wissenschaften, Institut für Mittelalterforschung. All transcriptions are licensed under the Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0) licence. Images were provided by the Austrian National Library (ÖNB).
Owner
- Name: HTR School Vienna
- Login: HTR-School-Vienna
- Kind: organization
- Repositories: 1
- Profile: https://github.com/HTR-School-Vienna
Citation (CITATION.cff)
cff-version: 1.12.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Odstrcilik
    given-names: Jan
    orcid: https://orcid.org/0000-0001-9104-9827
  - family-names: Yücel
    given-names: Fatih
    orcid: https://orcid.org/0000-0003-2591-5154
  - family-names: Rokpelne
    given-names: Liene
    orcid: https://orcid.org/0009-0006-5962-678X
  - family-names: Mihaljević
    given-names: Ana
    orcid: https://orcid.org/0000-0002-1988-4147
  - family-names: Šaldová
    given-names: Zuzana
    orcid: https://orcid.org/0009-0000-9365-7471
  - family-names: Zelenková
    given-names: Adéla
    orcid: https://orcid.org/0009-0007-7742-7722
  - family-names: Ciuntu
    given-names: Mara
    orcid: https://orcid.org/0009-0008-8472-0297
  - family-names: Lukáč Kováčová
    given-names: Magdaléna
    orcid: https://orcid.org/0009-0007-1369-6763
  - family-names: Vašíček
    given-names: Andrej
    orcid: https://orcid.org/0009-0007-5630-7000
  - family-names: Cehuľová
    given-names: Viktória
    orcid: https://orcid.org/0009-0000-3695-1244
  - family-names: Lukáč Labancová
    given-names: Ivana
    orcid: https://orcid.org/0009-0000-2060-1629
  - family-names: Engelmaier
    given-names: Leonhard
    orcid:
  - family-names: Scalia
    given-names: Andrea
    orcid:
  - family-names: Kohn
    given-names: Albert
    orcid:
  - family-names: Roček
    given-names: Martin
    orcid: https://orcid.org/0000-0001-7802-7252
title: "HTR Winter School 2023/2024 - Late Medieval Latin, ONB 3891"
version: 1.0.0
identifiers:
  - type: doi
    value: 10.5281/zenodo.1234
date-released: 2024-01-30