vrosenberg1853-numeral

Cross-Linguistic Data Format (CLDF) dataset derived from von Rosenberg's "De Mentawei-Eilanden en Hunne Bewoners" from 1853 for the comparative numeral data (p. 434). It is another practice session with CLDF to handle/test multiple languages.

https://github.com/complexico/vrosenberg1853-numeral

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.2%) to scientific vocabulary

Keywords

barrier-islands cldf comparative-word-list complexico cross-linguistic-data-format enggano indonesia lexibank1 mentawai nias numeral-word-list sumatran sumatran-language
Last synced: 6 months ago

Repository


Basic Info
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 1
  • Releases: 2
Topics
barrier-islands cldf comparative-word-list complexico cross-linguistic-data-format enggano indonesia lexibank1 mentawai nias numeral-word-list sumatran sumatran-language
Created over 1 year ago · Last pushed 11 months ago
Metadata Files
Readme License Citation Zenodo

README.md

CLDF dataset derived from von Rosenberg's "De Mentawei-Eilanden en Hunne Bewoners" from 1853 for comparative numeral data

How to cite

If you use these data, please cite:

  • the original source:
    von Rosenberg, Carl Benjamin Hermann. 1853. De Mentawei-Eilanden en Hunne Bewoners. Tijdschrift voor Indische Taal-, Land- en Volkenkunde 1. 403–440.
  • the derived dataset, using the DOI of the particular released version you used

Description

This dataset is licensed under a CC BY-NC-SA 4.0 license (https://creativecommons.org/licenses/by-nc-sa/4.0/).

Statistics

  • Glottolog: 100%
  • Concepticon: 100%
  • Source: 100%
  • BIPA: 100%
  • CLTS SoundClass: 100%

  • Varieties: 8 (linked to 8 different Glottocodes)
  • Concepts: 10 (linked to 10 different Concepticon concept sets)
  • Lexemes: 80
  • Sources: 1
  • Synonymy: 1.00
  • Invalid lexemes: 0
  • Tokens: 373
  • Segments: 30 (0 BIPA errors, 0 CLTS sound class errors, 30 CLTS modified)
  • Inventory size (avg): 15.38
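The counts above fit together arithmetically, which the following minimal sketch checks. It assumes that synonymy means the number of lexemes per (variety, concept) pair; that definition is an assumption made here, not something this page states. The tokens-per-lexeme figure is derived for illustration and is not reported in the README.

```python
# Figures copied from the Statistics list above.
varieties = 8
concepts = 10
lexemes = 80
tokens = 373

# Assumed definition: synonymy = lexemes per (variety, concept) slot.
slots = varieties * concepts
synonymy = lexemes / slots

# Derived figure (not in the README): mean segment tokens per lexeme.
tokens_per_lexeme = tokens / lexemes

print(f"slots: {slots}")
print(f"synonymy: {synonymy:.2f}")
print(f"tokens per lexeme: {tokens_per_lexeme:.2f}")
```

Under that assumption, a synonymy of 1.00 means that each of the 8 varieties has exactly one form for each of the 10 numeral concepts.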

CLDF Datasets

The following CLDF datasets are available in cldf:

Owner

  • Name: Computer-assisted Lexicology and Lexicography
  • Login: complexico
  • Kind: organization
  • Location: Bali, Indonesia

A research sub-group within the Linguistics strand of @cirhss. Studying words and curating lexical databases using computational and digital tools.

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: >-
  CLDF dataset derived from von Rosenberg's "De
  Mentawei-Eilanden en Hunne Bewoners" from 1853 for
  comparative numeral data
message: >-
  If you use this dataset, please cite it using the metadata
  from this file.
type: dataset
authors:
  - given-names: Gede Primahadi Wijaya
    family-names: Rajeg
    email: primahadi_wijaya@unud.ac.id
    affiliation: University of Oxford & Udayana University
    orcid: 'https://orcid.org/0000-0002-2047-8621'
repository-code: 'https://github.com/complexico/vrosenberg1853-numeral'
abstract: >-
  Cross-Linguistic Data Format (CLDF) dataset derived from
  von Rosenberg's "De Mentawei-Eilanden en Hunne Bewoners"
  from 1853 for the comparative numeral data (p. 434). It is
  a work-in-progress and another practice session with CLDF
  to handle/test multiple languages. In this first version
  (v1.0.0), the word forms are still in the original
  orthography and not yet segmented/tokenised. The next
  release attempts to include orthography standardisation
  and segmentation.
keywords:
  - Barrier-Island Languages
  - CLDF
  - Lexical Dataset
  - Old Word List
  - Legacy Materials
  - Numeral
  - Comparative Word List
  - Indonesian Languages
  - Cross-Linguistic Lexical Database
  - Word List
license: CC-BY-NC-SA-4.0
version: 1.0.0
date-released: '2025-02-06'
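
The abstract mentions segmentation/tokenisation of the word forms. As a rough illustration of what that step involves, here is a minimal sketch of greedy longest-match segmentation against a list of graphemes. It is a simplified stand-in written for this page, not the dataset's actual pipeline; the function name `segment`, the grapheme list, and the sample form are all hypothetical and are not taken from the dataset.

```python
def segment(form, graphemes):
    """Split a word form into graphemes by greedy longest match."""
    # Try longer graphemes first so that "ng" wins over "n" + "g".
    ordered = sorted(graphemes, key=len, reverse=True)
    segments = []
    i = 0
    while i < len(form):
        for g in ordered:
            if form.startswith(g, i):
                segments.append(g)
                i += len(g)
                break
        else:
            # Unknown character: keep it, flagged, rather than drop it.
            segments.append("<" + form[i] + ">")
            i += 1
    return segments


# Hypothetical grapheme list and form, for illustration only.
profile = ["ng", "n", "g", "a", "b"]
print(" ".join(segment("ngaba", profile)))
```

Flagging unknown characters instead of discarding them makes gaps in the grapheme list visible, which is useful when working from a historical orthography.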

GitHub Events

Total
  • Create event: 5
  • Issues event: 6
  • Release event: 2
  • Delete event: 2
  • Issue comment event: 4
  • Push event: 9
  • Pull request review comment event: 1
  • Pull request review event: 3
  • Pull request event: 5
Last Year
  • Create event: 5
  • Issues event: 6
  • Release event: 2
  • Delete event: 2
  • Issue comment event: 4
  • Push event: 9
  • Pull request review comment event: 1
  • Pull request review event: 3
  • Pull request event: 5

Issues and Pull Requests


All Time
  • Total issues: 2
  • Total pull requests: 3
  • Average time to close issues: 5 months
  • Average time to close pull requests: 3 minutes
  • Total issue authors: 1
  • Total pull request authors: 1
  • Average comments per issue: 2.5
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 3
  • Average time to close issues: about 11 hours
  • Average time to close pull requests: 3 minutes
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 3.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • engganolang (2)
Pull Request Authors
  • gederajeg (3)
Top Labels
Issue Labels
  • enhancement (2)
  • bug (1)
Pull Request Labels

Dependencies

.github/workflows/python-package.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
cldf/requirements.txt pypi
  • Babel ==2.12.1
  • Markdown ==3.4.3
  • SQLAlchemy ==1.4.48
  • Unidecode ==1.3.6
  • appdirs ==1.4.4
  • bs4 ==0.0.1
  • cdstarcat ==1.4.0
  • certifi ==2023.5.7
  • chardet ==5.1.0
  • cldfbench ==1.13.0
  • cldfcatalog ==1.5.1
  • cldfzenodo ==1.1.0
  • clldutils ==3.19.0
  • colorama ==0.4.6
  • colorlog ==6.7.0
  • csvw ==3.1.3
  • exceptiongroup ==1.1.1
  • gitdb ==4.0.10
  • html5lib ==1.1
  • idna ==3.4
  • iniconfig ==2.0.0
  • isodate ==0.6.1
  • jsonschema ==4.17.3
  • lingpy ==2.6.13
  • lxml ==4.9.2
  • nameparser ==1.1.2
  • networkx ==3.2.1
  • newick ==1.9.0
  • numpy ==2.0.0
  • openpyxl ==3.1.5
  • packaging ==23.1
  • pluggy ==1.0.0
  • purl ==1.6
  • pybtex ==0.24.0
  • pycdstar ==1.1.0
  • pycldf ==1.34.1
  • pyclts ==3.1.1
  • pyconcepticon ==3.0.0
  • pycountry ==22.3.5
  • pydictionaria ==2.2
  • pyglottolog ==3.11.0
  • pylatexenc ==2.10
  • pylexibank ==3.5.0
  • pyrsistent ==0.19.3
  • pytest ==7.3.1
  • python-dateutil ==2.8.2
  • pytz ==2024.1
  • rdflib ==6.3.2
  • regex ==2023.5.5
  • requests ==2.31.0
  • rfc3986 ==1.5.0
  • segments ==2.2.1
  • six ==1.16.0
  • smmap ==5.0.0
  • soupsieve ==2.4.1
  • tabulate ==0.9.0
  • termcolor ==2.3.0
  • tqdm ==4.65.0
  • uritemplate ==4.1.1
  • urllib3 ==1.26.6
  • wcwidth ==0.2.13
  • webencodings ==0.5.1
  • xlrd ==2.0.1
  • zenodoclient ==0.5.0
setup.py pypi
  • cldfbench *