concepticon-data

The curation repository for the data behind Concepticon.

https://github.com/concepticon/concepticon-data

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    13 of 50 committers (26.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.9%) to scientific vocabulary

Keywords

concepts cross-linguistic-data linguistics

Keywords from Contributors

cryptocurrencies mqtt-publisher notification transformation phylogenetics diffusion cldf cognates2nexus comparative-linguistics dataset-interface
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 39
  • Watchers: 13
  • Forks: 38
  • Open Issues: 135
  • Releases: 22
Topics
concepts cross-linguistic-data linguistics
Created almost 11 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog Contributing Zenodo

README.md

Concepticon data curation

The data underlying the Concepticon is maintained in this repository. Released versions of this data are distributed as CLDF datasets, uploaded to Zenodo from the concepticon-cldf repository.

Here, you can find

Concepticon Data

  • For an overview on the status of all currently linked conceptlists, see here.
  • For information on how you can contribute to the project or profit from the data sources we offer, see here.

Data Structure

  • conceptlists/ contains the concept lists, with links to IDs in concepticon.tsv. Lists are named after the first person who proposed them, the year of the reference publication from which we extracted them, and the number of concepts. These three parts are separated by dashes. Where two lists would otherwise have an identical name, we add alphabetical letters to distinguish them. Files need to have a "GLOSS" column (some still have "ENGLISH" instead, but this needs to be changed). Most, if not all, files also have a "NUMBER" field giving the number in the reference, which is important for ordering the entries as in the original source. Additional columns are more or less free to the user, but we tried to be consistent.

Some concept lists are based on sources that may change and thus require a mechanism for re-creation. In this case, there will be a directory named after the list, containing the relevant curation scripts.
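The naming and column conventions described above can be checked mechanically. The following is a minimal sketch, not the repository's own validation code: the function names are hypothetical, the list names in the usage note are illustrative, and it assumes that the distinguishing letter directly follows the number of concepts.

```python
import csv
import io
import re

# Assumed pattern: <proposer>-<year>-<number of concepts>[letter]
# The position of the optional letter is an assumption.
NAME_PATTERN = re.compile(
    r"^(?P<author>.+)-(?P<year>\d{4})-(?P<size>\d+)(?P<suffix>[a-z]?)$"
)


def parse_list_name(name):
    """Split a concept list name (without file ending) into its parts."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected concept list name: {name!r}")
    return {
        "author": match.group("author"),
        "year": int(match.group("year")),
        "size": int(match.group("size")),
        "suffix": match.group("suffix"),
    }


def check_columns(tsv_text):
    """Return a list of problems found in the header of a concept list."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    columns = reader.fieldnames or []
    problems = []
    if "GLOSS" not in columns:
        if "ENGLISH" in columns:
            problems.append("legacy ENGLISH column used instead of GLOSS")
        else:
            problems.append("missing GLOSS column")
    if "NUMBER" not in columns:
        problems.append("missing NUMBER column")
    return problems
```

For example, `parse_list_name("Swadesh-1955-100")` yields the proposer, year, and number of concepts as separate fields, and `check_columns` flags a file that still uses "ENGLISH" in place of "GLOSS".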

Concept lists may contain information about relations between concepts. If so, such relations must be stored as content of columns named LINKED|SOURCE|TARGET_CONCEPTS. The values for these columns must be

  • lists of edge objects, where
      • the concept described in the same row is assumed to be one node of the edge,
      • the second node is specified via a property ID, the value of which must be a concept identifier in the list,
  • serialized as JSON.

Edges in the graph described in LINKED_CONCEPTS are considered undirected, whereas edges in SOURCE|TARGET_CONCEPTS are considered directed, with the concepts specified in the edge objects identifying the SOURCE or TARGET, respectively, of the edge.

  • conceptlists.tsv contains metadata about the lists in conceptlists/.
  • references/references.bib is the BibTeX file with entries for all concept lists (the BibTeX key is identical to the name of the conceptlist file, without the file ending). The file also contains the references in which the conceptlists were published (stored in the "crossref" field).
  • sources/ contains PDF files of each conceptlist (only the list parts, not the full publications, for copyright reasons). Naming is the same as for the conceptlists, but with the ending ".pdf" instead of ".tsv".
  • concepticon.tsv is the backbone concept list. All concepts from individual concept lists are linked to entries in this file.
  • app/ contains data for running the JavaScript-based Concepticon lookup tool.
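To make the LINKED|SOURCE|TARGET_CONCEPTS convention concrete, here is a minimal sketch of how the edge columns of a concept list could be turned into edge sets. It is an illustration under assumptions, not the project's own code: the function name is hypothetical, and it assumes that each row carries its own concept identifier in a column named ID.

```python
import json


def edges_from_rows(rows):
    """Collect undirected and directed edges from concept list rows.

    Each row is a dict. The row's own identifier is assumed to be in "ID".
    Edge columns hold JSON lists of objects with an "ID" property.
    """
    undirected = set()
    directed = set()
    for row in rows:
        here = row["ID"]
        # LINKED_CONCEPTS: undirected, so the order of nodes is irrelevant.
        for edge in json.loads(row.get("LINKED_CONCEPTS") or "[]"):
            undirected.add(frozenset((here, edge["ID"])))
        # TARGET_CONCEPTS: the referenced concept is the target of the edge.
        for edge in json.loads(row.get("TARGET_CONCEPTS") or "[]"):
            directed.add((here, edge["ID"]))
        # SOURCE_CONCEPTS: the referenced concept is the source of the edge.
        for edge in json.loads(row.get("SOURCE_CONCEPTS") or "[]"):
            directed.add((edge["ID"], here))
    return undirected, directed
```

Storing undirected edges as `frozenset` pairs means an edge listed from both of its endpoints is counted once, while directed edges are kept as `(source, target)` tuples.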

Norms, Ratings and Relations associated with words and concepts

Before release 3.0, this repository contained metadata linked to Concepticon concept sets. With release 3.0, this data moved to a separate (though related) project, NoRaRe. For the curation and publication workflow of NoRaRe data see https://github.com/concepticon

Update policy

We try to release concepticon-data (as well as the CLDF dataset and the Concepticon web app) regularly, at least once a year. Generally, new releases should only become more comprehensive, i.e. all data ever released should also be part of the newest release. Occasionally, though, we may have to correct an erratum, which may result in some data being removed or in changed object identifiers. So whenever a link to the web app breaks or a script using the concepticon-data API throws an error, you should consult the list of errata to see whether an error correction may be the reason for this behaviour.

pyconcepticon

pyconcepticon provides a Python package to programmatically access Concepticon data.

Owner

  • Name: Concepticon
  • Login: concepticon
  • Kind: organization
  • Email: concepticon@eva.mpg.de

A Resource for the Linking of Concept Lists

GitHub Events

Total
  • Create event: 19
  • Release event: 2
  • Issues event: 49
  • Watch event: 7
  • Delete event: 18
  • Member event: 1
  • Issue comment event: 136
  • Push event: 56
  • Pull request review comment event: 141
  • Pull request review event: 121
  • Pull request event: 47
  • Fork event: 2
Last Year
  • Create event: 19
  • Release event: 2
  • Issues event: 49
  • Watch event: 7
  • Delete event: 18
  • Member event: 1
  • Issue comment event: 136
  • Push event: 56
  • Pull request review comment event: 141
  • Pull request review event: 121
  • Pull request event: 47
  • Fork event: 2

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 2,153
  • Total Committers: 50
  • Avg Commits per committer: 43.06
  • Development Distribution Score (DDS): 0.85
Past Year
  • Commits: 157
  • Committers: 11
  • Avg Commits per committer: 14.273
  • Development Distribution Score (DDS): 0.65
Top Committers
Name Email Commits
LinguList m****t@u****e 324
Annika a****a@g****m 239
xrotwang x****g@g****m 219
lingulist m****t@l****g 180
SimonGreenhill s****n@s****z 152
schweikhard y****u@e****m 121
Christoph Rzymski c****h@f****t 108
Mathilda van Zantwijk m****k@e****e 79
MuffinLinwist c****7@g****m 64
MacyL w****n@g****m 53
Kristina-Pianykh 5****h 50
Kristina-Pianykh p****h@s****e 47
Simon J Greenhill S****l 44
lingulist m****t@l****e 42
MottaAM 9****M 41
CarolinHu 5****u 40
MeiShinWu M****L 32
Nathanael E. Schweikhard 3****d 31
Johann-Mattis List L****t 31
martino-vic 5****c 28
Frederic Blum f****m@e****e 28
Tiago Tresoldi t****i@g****m 23
ilchec 18
Tiago Tresoldi t****i@s****e 14
LinguList m****t@p****e 13
marthuis a****2@g****m 13
Gereon Kaiping g****g@h****l 12
natalia-morozova 4****a 12
Frederic Blum f****m@h****e 9
MacyL r****a@g****m 9
and 20 more...

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 68
  • Total pull requests: 68
  • Average time to close issues: over 1 year
  • Average time to close pull requests: 9 days
  • Total issue authors: 13
  • Total pull request authors: 14
  • Average comments per issue: 0.79
  • Average comments per pull request: 2.93
  • Merged pull requests: 45
  • Bot issues: 0
  • Bot pull requests: 2
Past Year
  • Issues: 36
  • Pull requests: 38
  • Average time to close issues: 14 days
  • Average time to close pull requests: 4 days
  • Issue authors: 9
  • Pull request authors: 12
  • Average comments per issue: 0.47
  • Average comments per pull request: 2.37
  • Merged pull requests: 21
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • LinguList (15)
  • MiraAhmedovic (13)
  • alzkuc (11)
  • AnnikaTjuka (9)
  • bgo-eiu (6)
  • MuffinLinwist (4)
  • FredericBlum (3)
  • xrotwang (2)
  • henrik65 (1)
  • AHR-09 (1)
  • eva-dlce-zenodo (1)
  • HMRLKE (1)
  • chrzyki (1)
Pull Request Authors
  • alzkuc (16)
  • MiraAhmedovic (10)
  • chrzyki (7)
  • AnnikaTjuka (6)
  • MuffinLinwist (6)
  • LinguList (5)
  • xrotwang (5)
  • SimonGreenhill (3)
  • FredericBlum (3)
  • dependabot[bot] (2)
  • bgo-eiu (2)
  • mathildavz (1)
  • arubehn (1)
  • patkaiist (1)
Top Labels
Issue Labels
new concept list (6) object naming (4) NoRaRe (3) concept linking problems (3) students (3) errata (1) documentation (1) question (1) representation (1)
Pull Request Labels
object naming (4) dependencies (2) new concept list (1)

Dependencies

.github/workflows/concepticon-validation.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite