Scientific Software
Updated 6 months ago
ms3
ms3: A parser for MuseScore files, serving as data factory for annotated music corpora - Published in JOSS (2023)
Scientific Software · Peer-reviewed
Updated 4 months ago
https://github.com/dcavar/antisemitismdatathon2020
This is project material for the Antisemitism Datathon and Hackathon 2020 at Indiana University
Updated 6 months ago
artlang-dani-el
Texts and a grammar description of the language, for the birthday of M. A. Daniel
Updated 5 months ago
https://github.com/clarin-eric/pressmint
PressMint: Interoperable Corpora of Historical Newspapers
Scientific Software
Updated 6 months ago
BenchmarkDataNLP.jl
BenchmarkDataNLP.jl: Synthetic Data Generation for NLP Benchmarking - Published in JOSS (2025)
Scientific Software · Peer-reviewed
Updated 5 months ago
https://github.com/complexico/verb-noun-assoc-corpus-experiment
Repository of data and results for an undergraduate thesis titled "A Corpus-Based Study to Triangulating Experimental Evidence Regarding Verb-Noun Association for Action Verbs" by I Gede Semara Dharma Putra.
Updated 6 months ago
most-different-text-selection
Use embedding data from LLMs to determine the most different text in a given corpus.
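As a rough illustration of the idea in this description, the sketch below picks the text whose embedding has the lowest mean cosine similarity to all the others. This is only one possible reading of "most different"; the repository's actual method is not stated here. The function names and the toy vectors are hypothetical, and in practice the embeddings would come from an LLM embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_different_index(embeddings):
    """Return the index of the vector with the lowest mean cosine
    similarity to every other vector (assumes at least two vectors)."""
    best_index, best_score = None, float("inf")
    for i, vec in enumerate(embeddings):
        sims = [
            cosine_similarity(vec, other)
            for j, other in enumerate(embeddings)
            if j != i
        ]
        mean_sim = sum(sims) / len(sims)
        if mean_sim < best_score:
            best_index, best_score = i, mean_sim
    return best_index


# Made-up toy embeddings standing in for real model output.
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],  # orthogonal to the rest
    [0.8, 0.2, 0.0],
]

print(most_different_index(toy_embeddings))
```

Mean similarity is a simple outlier score; other choices (distance to the centroid, nearest-neighbour distance) would be equally plausible and may rank texts differently.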