vrosenberg1878

Digitised comparative word list derived from von Rosenberg's "Der Malayische Archipel: Land und Leute in Schilderungen, gesammelt während eines driessig-jährigen Aufenhaltes in den Kolonien" from 1878.

https://github.com/complexico/vrosenberg1878

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 11 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.3%) to scientific vocabulary

Keywords

barrier-island-languages-indonesia barrier-island-sumatra carl-hermann-von-rosenberg comparative-word-list complexico endangered-languages enggano enggano-lexical-database indonesia indonesian-archipelago indonesian-language indonesian-languages lexical-database lexicography linguistics regional-language von-rosenberg word-list
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 1
  • Open Issues: 3
  • Releases: 1
Created over 1 year ago · Last pushed 11 months ago
Metadata Files
Readme License Citation

README.Rmd

---
output: github_document
author: 'Gede Primahadi Wijaya Rajeg & Daniel Krauße'
title: "Digitised comparative word list derived from von Rosenberg's \"Der Malayische Archipel: Land und Leute in Schilderungen, gesammelt während eines driessig-jährigen Aufenhaltes in den Kolonien\" from 1878."
bibliography: references.bib
csl: "https://raw.githubusercontent.com/engganolang/kahler-1987/refs/heads/main/unified-style-sheet-for-linguistics.csl"
link-citations: true
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

[![The University of Oxford](file-oxweb-logo.gif){width="84"}](https://www.ox.ac.uk/) [![Faculty of Linguistics, Philology and Phonetics, the University of Oxford](file-lingphil.png){width="83"}](https://www.ling-phil.ox.ac.uk/) [![Arts and Humanities Research Council (AHRC)](file-ahrc.png){width="325"}](https://www.ukri.org/councils/ahrc/)

*This work is part of the [AHRC-funded project](https://app.dimensions.ai/details/grant/grant.12915105) on the lexical resources for Enggano, led by the Faculty of Linguistics, Philology and Phonetics at the University of Oxford, UK. Visit the [central webpage of the Enggano project](https://enggano.ling-phil.ox.ac.uk/)*.

This work is licensed under Creative Commons Attribution-NonCommercial 4.0 International

[![DOI](https://img.shields.io/badge/doi-10.25446/oxford.28323353.v1-blue.svg?style=flat&labelColor=whitesmoke)](http://dx.doi.org/10.25446/oxford.28323353.v1) [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.14780144.svg)](https://doi.org/10.5281/zenodo.14780144)

## How to cite

Please cite the source of the data set [@vonrosenberg1878] (in APA 7^th^ edition style) and the particular version of this repository [@rajeg_digitised_2024] (in [DataCite](https://support.datacite.org/docs/data-citation) style) as follows:

> von Rosenberg, C. B. H. (1878). _Der Malayische Archipel: Land und Leute in Schilderungen, gesammelt während eines driessig-jährigen Aufenhaltes in den Kolonien_. Gustav Weigel. https://hdl.handle.net/2027/mdp.39015065356076

> Rajeg, Gede Primahadi Wijaya; Krauße, Daniel (2024). Digitised comparative word list derived from von Rosenberg's “Der Malayische Archipel: Land und Leute in Schilderungen, gesammelt während eines driessig-jährigen Aufenhaltes in den Kolonien” from 1878. University of Oxford. Dataset. https://doi.org/10.25446/oxford.28323353.v1

For future updates and versions of record, please check the [Releases](https://github.com/complexico/vrosenberg1878/releases) page of this GitHub repository and its [Zenodo archive](https://doi.org/10.5281/zenodo.14780144).

## Overview

The work in this repository involves XML-tagging the relevant words (with their respective languages and German glosses) in the unstructured OCR output. The tagging is used to process the OCR output into a [tibble/table](https://github.com/complexico/vrosenberg1878/blob/main/data/vrosenberg1878.tsv) with an [R script](https://github.com/complexico/vrosenberg1878/blob/main/pre-processing.R). The comparative word list in von Rosenberg [-@vonrosenberg1878] includes words from the Enggano language, and these are included in the Shiny app of the [*EnoLEX*](https://enggano.shinyapps.io/enolex/) online database [@krause_enolex_2024; @rajeg_enolex_2024]. The other languages in the list (and their corresponding Glottocodes as laid out [here](https://glottolog.org/resource/reference/id/112913)) are shown in the map below.

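
As a rough illustration of the tagging-to-table step described above, the sketch below uses base R to pull tagged words out of OCR-like text and arrange them in a data frame. The `<w>` tag, its `lang` and `gloss` attributes, the `Language` and `German` column names, and the sample forms and glosses are all hypothetical placeholders chosen for this example; the actual tag scheme and processing are defined in `pre-processing.R` and may differ.

```r
# Hypothetical OCR lines with inline tags. The tag name, attribute names,
# word forms, and glosses are invented placeholders, not data from the source.
ocr <- c(
  'page text <w lang="Enggano" gloss="Haus">formA</w> stray OCR text',
  '<w lang="LanguageB" gloss="Haus">formB</w> ... <w lang="Enggano" gloss="Wasser">formC</w>'
)

# One capture group each for the language, the German gloss, and the word form.
pattern <- '<w lang="([^"]*)" gloss="([^"]*)">([^<]*)</w>'

# Collect every tagged span across all lines, then split each span
# into the full match plus its three capture groups.
tagged <- unlist(regmatches(ocr, gregexpr(pattern, ocr)))
parts  <- regmatches(tagged, regexec(pattern, tagged))

# Element 1 of each match is the whole span; elements 2 to 4 are the groups.
word_list <- data.frame(
  Language    = vapply(parts, `[`, character(1), 2),
  German      = vapply(parts, `[`, character(1), 3),
  OldFormOrig = vapply(parts, `[`, character(1), 4),
  stringsAsFactors = FALSE
)

print(word_list)
```

A regular expression is enough here only because the assumed tags are flat and never nested; for richer markup, a proper XML parser would be the safer choice.
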
The column `OldFormOrig` in the [table](https://github.com/complexico/vrosenberg1878/blob/main/data/vrosenberg1878.tsv) contains the original form/spelling in the source text, while `OldFormChange` contains the changes made (e.g., typo correction, adjustment, OCR error fixing) to the original form/spelling. The `English` and `Indonesian` columns are translations of the original German glosses into those two languages. The translations were produced with the DeepL web translator.

The grouping of each individual language (captured in [this code line](https://github.com/complexico/vrosenberg1878/blob/main/pre-processing.R#L42)) into its larger group is based on the mapping [here](https://babel.hathitrust.org/cgi/pt?id=mdp.39015065356076&seq=637). On that mapping page, the _Kei-Inseln_ do not appear to be linked to a group but, based on the description [here](https://babel.hathitrust.org/cgi/pt?id=mdp.39015065356076&seq=355&q1=Kei-Inseln) (cf. from the third sentence of the first paragraph of the sub-section titled _Geographische Uebersicht_), they are closer to the _Südoster-Inseln_.

```{r, echo=FALSE}
#| label: fig-map
knitr::include_graphics("img/language-map.png")
```

## Contributors

Name | GitHub user | Description | Role
--- | --- | --- | ---
Rajeg, Gede Primahadi Wijaya | gederajeg | Data Curator, XML-tagging, Software, Archiving | Author
Krauße, Daniel | | Data Curator | Author

## References

Owner

  • Name: Computer-assisted Lexicology and Lexicography
  • Login: complexico
  • Kind: organization
  • Location: Bali, Indonesia

A research sub-group within the Linguistics strand of @cirhss. Studying words and curating lexical databases using computational and digital tools.

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: >-
  Digitised comparative word list derived from von
  Rosenberg's “Der Malayische Archipel: Land und Leute in
  Schilderungen, gesammelt während eines driessig-jährigen
  Aufenhaltes in den Kolonien” from 1878.
message: >-
  If you use this dataset, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Gede Primahadi Wijaya
    family-names: Rajeg
    email: primahadi_wijaya@unud.ac.id
    affiliation: University of Oxford
    orcid: 'https://orcid.org/0000-0002-2047-8621'
  - given-names: Daniel
    family-names: Krauße
    email: krausse.daniel@gmail.com
    orcid: 'https://orcid.org/0000-0002-9340-6960'
    affiliation: Centre National de la Recherche Scientifique
identifiers:
  - type: doi
    value: 10.5281/zenodo.14780144
    description: Zenodo
repository-code: 'https://github.com/complexico/vrosenberg1878'
repository: 'https://doi.org/10.5281/zenodo.14780144'
keywords:
  - lexical database
  - comparative wordlist
  - old wordlist
  - legacy materials
  - Indonesian languages
  - Carl Herman von Rosenberg
  - CompLexico
  - Enggano
  - Enggano language
  - Barrier island languages
  - Bahasa Enggano
  - lexicology
  - lexicography
  - digital humanities
license: CC-BY-NC-4.0
version: 1.0.0
date-released: '2024-12-28'

GitHub Events

Total
  • Create event: 3
  • Release event: 1
  • Issues event: 2
  • Delete event: 1
  • Push event: 24
  • Pull request review event: 3
  • Pull request review comment event: 3
  • Pull request event: 21
Last Year
  • Create event: 3
  • Release event: 1
  • Issues event: 2
  • Delete event: 1
  • Push event: 24
  • Pull request review event: 3
  • Pull request review comment event: 3
  • Pull request event: 21

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 2
  • Total pull requests: 9
  • Average time to close issues: N/A
  • Average time to close pull requests: 16 minutes
  • Total issue authors: 1
  • Total pull request authors: 2
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 9
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 9
  • Average time to close issues: N/A
  • Average time to close pull requests: 16 minutes
  • Issue authors: 1
  • Pull request authors: 2
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 9
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • gederajeg (3)
Pull Request Authors
  • gederajeg (13)
  • engganolang (2)
Top Labels
Issue Labels
bug (1)
Pull Request Labels
bug (1) documentation (1) enhancement (1)