dutch-plurals
List of Dutch lemma/plural pairs for training/evaluating (de)pluralizers
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 1 DOI reference(s) in README
- ✓ Academic publication links: links to zenodo.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (5.6%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: CentreForDigitalHumanities
- License: bsd-3-clause
- Language: Python
- Default Branch: main
- Size: 8.81 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
Dutch Plurals and Articles
This repository contains a list of manually annotated and verified Dutch singular (lemma)/plural pairs. The annotation was done by Henk Pander Maat, and the list can be found in input.tsv.
It also contains a list of articles in gender.tsv.
Using transform.py, these files can be converted into output.tsv, which can be used as input for froggen.
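As a rough illustration of the kind of conversion described above, the sketch below joins lemma/plural pairs with an article list. The column layouts (lemma and plural in one file, lemma and article in the other), the output column order, and the function name `merge_pairs` are all assumptions made for illustration. The actual formats of input.tsv, gender.tsv and the froggen input, and the logic in transform.py, may well differ.

```python
import csv
import io


def merge_pairs(pairs_tsv: str, articles_tsv: str) -> str:
    """Join lemma/plural rows with articles (hypothetical column layout)."""
    # Assumed layout of the article list: lemma <TAB> article
    articles = {}
    for row in csv.reader(io.StringIO(articles_tsv), delimiter="\t"):
        if len(row) >= 2:
            articles[row[0]] = row[1]

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # Assumed layout of the pair list: lemma <TAB> plural
    for row in csv.reader(io.StringIO(pairs_tsv), delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        lemma, plural = row[0], row[1]
        # A lemma without a listed article gets an empty article column
        writer.writerow([lemma, plural, articles.get(lemma, "")])
    return out.getvalue()


# Tiny in-memory example instead of the real files
pairs = "huis\thuizen\nstad\tsteden\n"
articles = "huis\thet\n"
print(merge_pairs(pairs, articles), end="")
```

Keeping the join on in-memory strings makes the sketch easy to try out; reading the real files would only require passing their contents in.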
Wiktionary
The machine-readable Wiktionary data from Tatu Ylonen was used to expand these lists and to create an initial file containing the articles.
Considerations for Article Usage
For uncapitalized words, a native speaker can often easily determine the article. If there is none (for example, for months), it is left blank. For capitalized words there is more room for ambiguity. Rivers, lakes, seas, mountains, deserts, inhabitants, streets, squares and languages have articles. Acronyms, cars and devices also generally have them. Countries, cities, regions and brand names generally do not. Nominal adjectives should be recorded as adjectives.
Nominal Adjectives
Words which can be used as nominal adjectives are tagged as 'ADJ' instead of 'N'. If they can also be used as regular nouns, the 'N' tag is added to the output file as well. For example, compare "Het Nederlands is een Germaanse taal." (N) with "Dat is typisch Nederlands." (ADJ).
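The tagging rule above can be sketched as a small function. The function name `tags_for` and the two boolean flags are hypothetical and chosen only for illustration; this is not necessarily how transform.py represents or derives the tags.

```python
def tags_for(is_nominal_adjective: bool, is_noun: bool) -> list:
    """Return the tags for a word under the rule described above (sketch)."""
    tags = []
    if is_nominal_adjective:
        tags.append("ADJ")  # nominal adjectives are tagged as ADJ
    if is_noun:
        tags.append("N")    # a regular noun reading adds the N tag
    return tags


# "Nederlands" has both readings in the README's example sentences
print(tags_for(is_nominal_adjective=True, is_noun=True))   # ['ADJ', 'N']
# A word with only a noun reading keeps the plain N tag
print(tags_for(is_nominal_adjective=False, is_noun=True))  # ['N']
```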
Owner
- Name: Centre for Digital Humanities
- Login: CentreForDigitalHumanities
- Kind: organization
- Email: cdh@uu.nl
- Location: Netherlands
- Website: https://cdh.uu.nl/
- Repositories: 39
- Profile: https://github.com/CentreForDigitalHumanities
Interdisciplinary centre for research and education in computational and data-driven methods in the humanities.
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: Dutch Plurals
message: >-
  If you use this dataset, please cite it using the metadata
  from this file.
type: dataset
authors:
  - given-names: Henk
    family-names: Pander Maat
    affiliation: Utrecht University
    orcid: 'https://orcid.org/0000-0002-5515-4627'
  - given-names: Sheean
    family-names: Spoel
    email: s.j.j.spoel@uu.nl
    affiliation: Utrecht University
    orcid: 'https://orcid.org/0000-0002-6802-4135'
repository-code: >-
  https://github.com/CentreForDigitalHumanities/dutch-plurals
abstract: >-
  A list of manually annotated/verified Dutch plural
  singular/lemma combinations. This was done by Henk Pander
  Maat and can be found under input.tsv. Using transform.py
  this can be converted into output.tsv which is a file
  which can be used as input for froggen.
keywords:
  - froggen
  - Dutch
  - plural
  - singular
  - morphology
license: BSD-3-Clause
Committers
Last synced: 10 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Sheean Spoel | s****l@u****l | 13 |
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0