Repository
Compute complexity metrics from Universal Dependencies
Statistics
- Stars: 2
- Watchers: 3
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
udstyle
Compute complexity metrics from Universal Dependencies.
Input can be a .conllu file, or a plain text file that will be parsed by Stanza, provided Stanza is installed and a language is specified.
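To make the input format concrete, here is a minimal sketch of reading CoNLL-U data in plain Python. It relies only on the standard CoNLL-U layout (ten tab-separated columns, with HEAD in the seventh). The function name `read_conllu` and the sample sentence are illustrative assumptions, not part of udstyle, whose own reader may differ.

```python
def read_conllu(text):
    """Yield sentences as lists of token dicts from CoNLL-U formatted text.

    Hypothetical helper for illustration; not udstyle's actual reader.
    """
    sentence = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith('#'):
            continue  # sentence-level comment, e.g. "# text = ..."
        cols = line.split('\t')
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        sentence.append({
            'id': int(cols[0]),
            'form': cols[1],
            'upos': cols[3],
            'head': int(cols[6]),  # assumes HEAD is filled in for every word
            'deprel': cols[7],
        })
    if sentence:
        yield sentence


# Made-up sample sentence, annotated by hand for this example.
SAMPLE = (
    "# text = De kat slaapt.\n"
    "1\tDe\tde\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tkat\tkat\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "3\tslaapt\tslapen\tVERB\t_\t_\t0\troot\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_\n"
    "\n"
)
sentences = list(read_conllu(SAMPLE))
```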
Usage: python3 udstyle.py [OPTIONS] FILE...
  --parse=LANG       parse texts with Stanza; provide a two-letter language code
  --output=FILENAME  write results to a tab-separated file
  --persentence      report per-sentence results instead of the mean per document
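A file written with --output can be loaded with Python's standard csv module. The sketch below assumes a layout that this README does not spell out: a header row of metric names, then one row per input file with the file name in the first column. The helper name `read_results` is hypothetical.

```python
import csv
import io


def read_results(handle):
    """Map each row label to a {metric: value} dict.

    Assumes (not confirmed by the README) a header row followed by rows
    whose first cell is a label and whose remaining cells are numbers.
    """
    reader = csv.reader(handle, delimiter='\t')
    header = next(reader)
    return {
        row[0]: dict(zip(header[1:], (float(cell) for cell in row[1:])))
        for row in reader
        if row  # ignore blank lines
    }


# In-memory stand-in for a results file; a real file would be opened with
# open(filename, newline='', encoding='utf8').
demo = io.StringIO(
    "\tLEN\tMDD\n"
    "dev.conllu\t14.182\t2.461\n"
    "test.conllu\t11.434\t2.192\n"
)
results = read_results(demo)
```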
Reported metrics:
- LEN: mean sentence length in words (excluding punctuation).
- MDD: mean dependency distance (Gibson, 1998).
- NDD: normalized dependency distance (Lei & Jockers, 2018).
- ADJD: proportion of adjacent dependencies.
- LEFT: dependency direction: proportion of left dependents.
- MOD: nominal modifiers (Biber & Gray, 2010).
- CLS: number of clauses per sentence.
- CLL: mean clause length (words per clause).
- LXD: lexical density: ratio of content words to total number of words.
- POS/DEP tag frequencies (only with --output).
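As a rough illustration of how some of these metrics follow from a dependency tree, the sketch below computes LEN, MDD, ADJD, LEFT and LXD for one sentence. It is not udstyle's implementation. Several choices are assumptions made for the example: punctuation tokens and the root relation are left out of the dependency statistics, distances use the original token positions, "left dependent" is read as a dependent that precedes its head, and the content-word tag set is picked by hand.

```python
# Assumed set of content-word UPOS tags; udstyle may use a different set.
CONTENT_UPOS = {'NOUN', 'PROPN', 'VERB', 'ADJ', 'ADV'}


def ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def sentence_metrics(tokens):
    """Compute metrics for one sentence given (id, upos, head) triples."""
    words = [t for t in tokens if t[1] != 'PUNCT']
    # Dependencies as (dependent position, head position), without the root.
    deps = [(i, head) for i, _, head in words if head != 0]
    distances = [abs(i - head) for i, head in deps]
    return {
        'LEN': len(words),
        'MDD': ratio(sum(distances), len(distances)),
        'ADJD': ratio(sum(d == 1 for d in distances), len(distances)),
        'LEFT': ratio(sum(i < head for i, head in deps), len(deps)),
        'LXD': ratio(sum(upos in CONTENT_UPOS for _, upos, _ in words),
                     len(words)),
    }


# "De kat ziet een grote hond ." (made-up sentence, annotated by hand)
example = [
    (1, 'DET', 2), (2, 'NOUN', 3), (3, 'VERB', 0), (4, 'DET', 6),
    (5, 'ADJ', 6), (6, 'NOUN', 3), (7, 'PUNCT', 3),
]
metrics = sentence_metrics(example)
```

For this sentence the five counted dependencies have distances 1, 1, 2, 1 and 3, so MDD is 8/5 = 1.6 and ADJD is 3/5 = 0.6. Four of the five dependents precede their head, which gives LEFT = 0.8.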
Example:
$ python3 udstyle.py UD_Dutch-LassySmall/*.conllu
              LEN     MDD    NDD    ADJD   LEFT   MOD    CLS    CLL    LXD
dev.conllu    14.182  2.461  0.926  0.500  0.459  0.052  2.223  9.190  0.603
test.conllu   11.434  2.192  0.807  0.547  0.412  0.074  1.771  9.013  0.657
train.conllu  11.027  2.172  0.775  0.564  0.391  0.072  1.863  8.107  0.645
$ python3 udstyle.py --parse=nl troonrede.txt
[...]
References
Simple readability metrics: https://github.com/andreasvc/readability/
If you use this code for research, please cite this repository.
Owner
- Name: Andreas van Cranenburgh
- Login: andreasvc
- Kind: user
- Location: Groningen
- Website: http://andreasvc.github.io
- Repositories: 20
- Profile: https://github.com/andreasvc
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: udstyle
message: >-
  Please cite this software using the metadata from
  'preferred-citation'.
type: software
authors:
  - orcid: 'https://orcid.org/0000-0002-4545-1548'
    given-names: Andreas
    name-particle: van
    family-names: Cranenburgh
    email: a.w.van.cranenburgh@rug.nl
    affiliation: University of Groningen
identifiers:
  - type: url
    value: 'https://github.com/andreasvc/udstyle'
GitHub Events
Total
- Push event: 2
Last Year
- Push event: 2