vrosenberg1878
Digitised comparative word list derived from von Rosenberg's "Der Malayische Archipel: Land und Leute in Schilderungen, gesammelt während eines driessig-jährigen Aufenhaltes in den Kolonien" from 1878.
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 11 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.3%) to scientific vocabulary
Keywords
Repository
Basic Info
- Host: GitHub
- Owner: complexico
- License: other
- Language: R
- Default Branch: main
- Homepage: https://doi.org/10.25446/oxford.28323353
- Size: 975 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 3
- Releases: 1
Topics
Metadata Files
README.Rmd
---
output: github_document
author: 'Gede Primahadi Wijaya Rajeg & Daniel Krauße'
title: "Digitised comparative word list derived from von Rosenberg's \"Der Malayische Archipel: Land und Leute in Schilderungen, gesammelt während eines driessig-jährigen Aufenhaltes in den Kolonien\" from 1878."
bibliography: references.bib
csl: "https://raw.githubusercontent.com/engganolang/kahler-1987/refs/heads/main/unified-style-sheet-for-linguistics.csl"
link-citations: true
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

[{width="84"}](https://www.ox.ac.uk/) [{width="83"}](https://www.ling-phil.ox.ac.uk/) [{width="325"}](https://www.ukri.org/councils/ahrc/)

*This work is part of the [AHRC-funded project](https://app.dimensions.ai/details/grant/grant.12915105) on the lexical resources for Enggano, led by the Faculty of Linguistics, Philology and Phonetics at the University of Oxford, UK. Visit the [central webpage of the Enggano project](https://enggano.ling-phil.ox.ac.uk/)*.
This work is licensed under Creative Commons Attribution-NonCommercial 4.0 International
[](http://dx.doi.org/10.25446/oxford.28323353.v1) [](https://doi.org/10.5281/zenodo.14780144)

## How to cite

Please cite the source of the data set [@vonrosenberg1878] (in APA 7^th^ style) and the particular version of this repository [@rajeg_digitised_2024] (in [DataCite](https://support.datacite.org/docs/data-citation) style) as follows:

> von Rosenberg, C. B. H. (1878). _Der Malayische Archipel: Land und Leute in Schilderungen, gesammelt während eines driessig-jährigen Aufenhaltes in den Kolonien_. Gustav Weigel. https://hdl.handle.net/2027/mdp.39015065356076

> Rajeg, Gede Primahadi Wijaya; Krauße, Daniel (2024). Digitised comparative word list derived from von Rosenberg's “Der Malayische Archipel: Land und Leute in Schilderungen, gesammelt während eines driessig-jährigen Aufenhaltes in den Kolonien” from 1878. University of Oxford. Dataset. https://doi.org/10.25446/oxford.28323353.v1

For future updates and versions of record, please check the [Releases](https://github.com/complexico/vrosenberg1878/releases) page of this GitHub repository and its [Zenodo archive](https://doi.org/10.5281/zenodo.14780144).

## Overview

The work in this repository involves XML-tagging the relevant words (with their respective languages and German glosses) in the unstructured OCR output. The tagging is used to process the OCR output into a [tibble/table](https://github.com/complexico/vrosenberg1878/blob/main/data/vrosenberg1878.tsv) with an [R script](https://github.com/complexico/vrosenberg1878/blob/main/pre-processing.R). The comparative word list in von Rosenberg [-@vonrosenberg1878] includes words from the Enggano language, which are included in the Shiny app of the [*EnoLEX*](https://enggano.shinyapps.io/enolex/) online database [@krause_enolex_2024; @rajeg_enolex_2024]. The list of other languages (and their corresponding Glottocodes as laid out [here](https://glottolog.org/resource/reference/id/112913)) can be seen in the map below.
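The tagging-to-table step described in the Overview can be illustrated with a minimal base-R sketch. It is not the repository's `pre-processing.R`: the `<w>` tag, its `lang` and `de` attributes, the sample lines, and the helper `extract_words()` are all hypothetical, chosen only to show how tagged words in OCR text might be collected into one row per form.

```r
# Hypothetical tagged OCR lines. The <w> tag and its lang/de attributes are
# illustrative assumptions, not the markup actually used in the repository.
tagged <- c(
  'Haus: <w lang="LanguageA" de="Haus">form1</w>',
  'Haus: <w lang="LanguageB" de="Haus">form2</w>',
  'untagged running text from the OCR output'
)

# Collect the first tagged word on each line into a data frame,
# skipping lines that carry no tag.
extract_words <- function(lines) {
  pattern <- '<w lang="([^"]*)" de="([^"]*)">([^<]*)</w>'
  hits <- regmatches(lines, regexec(pattern, lines))
  hits <- hits[lengths(hits) == 4]  # full match plus three capture groups
  data.frame(
    Language = vapply(hits, `[`, character(1), 2),
    German = vapply(hits, `[`, character(1), 3),
    Form = vapply(hits, `[`, character(1), 4),
    stringsAsFactors = FALSE
  )
}

words <- extract_words(tagged)
print(words)
```

A regular expression keeps the sketch free of dependencies, but it only handles one tag per line and a fixed attribute order. For real markup, a dedicated XML parser would likely be the more robust choice.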
The column `OldFormOrig` in the [table](https://github.com/complexico/vrosenberg1878/blob/main/data/vrosenberg1878.tsv) contains the original form/spelling in the source text, while `OldFormChange` contains the changes made to the original form/spelling (e.g., typo correction, adjustment, OCR error fixing). The `English` and `Indonesian` columns contain translations of the original German glosses into those two languages. The translation was performed using the DeepL web translator.

The grouping of each individual language into its larger group (captured in [this code line](https://github.com/complexico/vrosenberg1878/blob/main/pre-processing.R#L42)) is based on the mapping [here](https://babel.hathitrust.org/cgi/pt?id=mdp.39015065356076&seq=637). On that mapping page, the _Kei-Inseln_ do not appear to be linked to a group but, based on the description [here](https://babel.hathitrust.org/cgi/pt?id=mdp.39015065356076&seq=355&q1=Kei-Inseln) (cf. the third sentence onwards of the first paragraph of the sub-section titled _Geographische Uebersicht_), they are closer to the _Südoster-Inseln_.

```{r, echo=FALSE}
#| label: fig-map
knitr::include_graphics("img/language-map.png")
```

## Contributors

Name | GitHub user | Description | Role
--- | --- | --- | ---
Rajeg, Gede Primahadi Wijaya | gederajeg | Data Curator, XML-tagging, Software, Archiving | Author
Krauße, Daniel | | Data Curator | Author

## References
Owner
- Name: Computer-assisted Lexicology and Lexicography
- Login: complexico
- Kind: organization
- Location: Bali, Indonesia
- Repositories: 1
- Profile: https://github.com/complexico
A research sub-group within the Linguistics strand of @cirhss. Studying words and curating lexical databases using computational and digital tools.
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: >-
  Digitised comparative word list derived from von
  Rosenberg's “Der Malayische Archipel: Land und Leute in
  Schilderungen, gesammelt während eines driessig-jährigen
  Aufenhaltes in den Kolonien” from 1878.
message: >-
  If you use this dataset, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Gede Primahadi Wijaya
    family-names: Rajeg
    email: primahadi_wijaya@unud.ac.id
    affiliation: University of Oxford
    orcid: 'https://orcid.org/0000-0002-2047-8621'
  - given-names: Daniel
    family-names: Krauße
    email: krausse.daniel@gmail.com
    orcid: 'https://orcid.org/0000-0002-9340-6960'
    affiliation: Centre National de la Recherche Scientifique
identifiers:
  - type: doi
    value: 10.5281/zenodo.14780144
    description: Zenodo
repository-code: 'https://github.com/complexico/vrosenberg1878'
repository: 'https://doi.org/10.5281/zenodo.14780144'
keywords:
  - lexical database
  - comparative wordlist
  - old wordlist
  - legacy materials
  - Indonesian languages
  - Carl Herman von Rosenberg
  - CompLexico
  - Enggano
  - Enggano language
  - Barrier island languages
  - Bahasa Enggano
  - lexicology
  - lexicography
  - digital humanities
license: CC-BY-NC-4.0
version: 1.0.0
date-released: '2024-12-28'
GitHub Events
Total
- Create event: 3
- Release event: 1
- Issues event: 2
- Delete event: 1
- Push event: 24
- Pull request review event: 3
- Pull request review comment event: 3
- Pull request event: 21
Last Year
- Create event: 3
- Release event: 1
- Issues event: 2
- Delete event: 1
- Push event: 24
- Pull request review event: 3
- Pull request review comment event: 3
- Pull request event: 21
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 2
- Total pull requests: 9
- Average time to close issues: N/A
- Average time to close pull requests: 16 minutes
- Total issue authors: 1
- Total pull request authors: 2
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 9
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 2
- Pull requests: 9
- Average time to close issues: N/A
- Average time to close pull requests: 16 minutes
- Issue authors: 1
- Pull request authors: 2
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 9
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- gederajeg (3)
Pull Request Authors
- gederajeg (13)
- engganolang (2)
