thec_eng

The Hoosier Ellipsis Corpus (THEC) - English Sub-corpus (thec_eng)

https://github.com/dcavar/thec_eng

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 2 committers (50.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.1%) to scientific vocabulary
Last synced: 7 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: dcavar
  • License: apache-2.0
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 351 KB
Statistics
  • Stars: 1
  • Watchers: 3
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 2 years ago · Last pushed almost 2 years ago
Metadata Files
  • Readme
  • License
  • Citation

README.md

The Hoosier Ellipsis Corpus (THEC) - English Sub-corpus (thec_eng)

(C) 2024 NLP-Lab

More details about the Hoosier Ellipsis Corpus can be found on the NLP-Lab pages. The main THEC GitHub repo links to the sub-corpora for other languages and provides code and scripts for data processing.

This repo contains the English Ellipsis Sub-corpus of THEC.

Consult the data format specification for details about the structure of the files and the annotation standard used.

Maintainers

  • Emily Reed
  • Billy Dickson
  • Muhammed S Abdo
  • Tanmayi Balla
  • Van Holthenrichs
  • Damir Cavar

Citation

Please use the following snippet to cite our work.

```bibtex
@inproceedings{cavar-etal-2024-typology,
    title = "The Typology of Ellipsis: A Corpus for Linguistic Analysis and Machine Learning Applications",
    author = "Cavar, Damir and Mompelat, Ludovic and Abdo, Muhammad",
    editor = "Hahn, Michael and Sorokin, Alexey and Kumar, Ritesh and Shcherbakov, Andreas and Otmakhova, Yulia and Yang, Jinrui and Serikov, Oleg and Rani, Priya and Ponti, Edoardo M. and Murado{\u{g}}lu, Saliha and Gao, Rena and Cotterell, Ryan and Vylomova, Ekaterina",
    booktitle = "Proceedings of the 6th Workshop on Research in Computational Linguistic Typology and Multilingual NLP",
    month = mar,
    year = "2024",
    address = "St. Julian's, Malta",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.sigtyp-1.6",
    pages = "46--54"
}

@inproceedings{cavar-atal-2004-computing,
    author = "Cavar, Damir and Zoran Tiganj and Ludovic Mompelat and Billy Dickson",
    title = {Computing Ellipsis Constructions: Comparing Classical {NLP} and {LLM} Approaches},
    booktitle = {2024 Meeting of the Society for Computation in Linguistics (SCiL)},
    month = may,
    year = {2024},
    address = {},
    publisher = {},
    url = {},
    pages = "--"
}
```

Owner

  • Name: Damir Cavar
  • Login: dcavar
  • Kind: user
  • Location: Bloomington, IN
  • Company: Indiana University

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Cavar"
  given-names: "Damir"
  orcid: "https://orcid.org/0000-0002-1262-5927"
- family-names: "Mompelat"
  given-names: "Ludovic Veta"
- family-names: "Abdo"
  given-names: "Muhammed S"
title: "The Typology of Ellipsis: A Corpus for Linguistic Analysis and Machine Learning Applications"
version: 2.0.4
date-released: 2024-03-22
url: "https://github.com/dcavar/hoosierellipsiscorpus"

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 12
  • Total Committers: 2
  • Avg Commits per committer: 6.0
  • Development Distribution Score (DDS): 0.083
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name         Email          Commits
Damir Cavar  d****r@m****m  11
Van Holt     v****h@i****u  1
Committer Domains (Top 20 + Academic)
  • iu.edu: 1
  • me.com: 1
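
The all-time committer figures above can be reproduced from the per-committer commit counts. The sketch below assumes the commonly used definition of the Development Distribution Score, DDS = 1 - (commits by the top committer / total commits); this page does not state the formula, so treat that definition, the helper name `development_distribution_score`, and the choice to return 0.0 when there are no commits as assumptions rather than the indexer's actual implementation.

```python
def development_distribution_score(commits_per_committer):
    """Return 1 - (top committer's share of all commits).

    Assumed definition; returns 0.0 when there are no commits at all,
    which matches the "Past Year" figures listed in this section.
    """
    total = sum(commits_per_committer.values())
    if total == 0:
        return 0.0
    return 1 - max(commits_per_committer.values()) / total


# Commit counts taken from the "Top Committers" listing.
commits = {"Damir Cavar": 11, "Van Holt": 1}

dds = development_distribution_score(commits)
avg_commits = sum(commits.values()) / len(commits)

print(f"Total commits: {sum(commits.values())}")      # 12
print(f"Avg commits per committer: {avg_commits:.1f}")  # 6.0
print(f"DDS: {dds:.3f}")                                # 0.083
```

A score near 0 means that a single committer accounts for almost all commits, as is the case here (11 of 12).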

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Dependencies

requirements.txt pypi
  • ipydatagrid *
  • pandas *