Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.6%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: sjtu-compling
  • Language: TeX
  • Default Branch: main
  • Size: 2.99 MB
Statistics
  • Stars: 7
  • Watchers: 1
  • Forks: 2
  • Open Issues: 0
  • Releases: 0
Created over 2 years ago · Last pushed 9 months ago
Metadata Files
Readme Citation

README.md

MELA: Multilingual Evaluation of Linguistic Acceptability

This repository contains data for the MELA (Multilingual Evaluation of Linguistic Acceptability) benchmark.

Note that to prevent data contamination, we distribute the data as a password-protected zip file; the password is 200240.
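Any zip tool that accepts a password can extract the archive. Below is a minimal Python sketch; the archive name `mela.zip` is a placeholder (use the actual zip in this repository), and it assumes legacy ZipCrypto encryption, the only scheme the standard-library `zipfile` module can decrypt:

```python
import zipfile

def extract_mela(archive_path, dest_dir, password=b"200240"):
    """Extract the password-protected MELA archive into dest_dir.

    The password comes from the README. Note: Python's zipfile only
    decrypts legacy ZipCrypto archives; AES-encrypted zips need an
    external tool such as 7-Zip.
    """
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(path=dest_dir, pwd=password)
```

For an AES-encrypted archive, `extractall` raises an error and an external unzipper is required instead.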

News 🔥🔥🔥

MELA is now available in the LM Evaluation Harness.

Now you may evaluate your model on MELA like any other task in the harness:

```
git clone https://github.com/EleutherAI/lm-evaluation-harness
cd lm-evaluation-harness
pip install -e .

lm_eval --model hf --model_args pretrained=[model_name_or_path] --tasks mela --device cuda:0 --num_fewshot 2 --output_path results --log_samples
```

Some models' results:

| model | shot | reported in the paper | lm eval harness |
|-|-|-|-|
| BLOOMZ 7B | 0 | 5.85 | 5.99±0.85 |
| BLOOMZ 7B | 2 | 4.31 | 4.11±0.87 |
| mT0 13B | 0 | 6.62 | 7.72±0.88 |
| mT0 13B | 2 | 7.70 | 5.82±0.75 |
| mTk 13B | 0 | 2.24 | 3.16±1.01 |
| mTk 13B | 2 | 12.05 | 12.26±0.98 |

Description

MELA contains 46k acceptable and unacceptable sentences from 10 languages: English, Chinese, Italian, Russian, German, French, Spanish, Japanese, Arabic, and Icelandic. Sentences in English, Italian, Russian, and Chinese are consolidated from previous works; these are referred to as the high-resource languages. Samples in the other languages (the low-resource languages) are sourced from renowned linguistics publications such as syntax textbooks and journal articles. In our paper, we showcase three potential usages of MELA:

- Benchmarking LLMs
- Cross-lingual transfer
- Syntax acquisition

Here are example sentences from MELA:

*Figure: example sentences from MELA.*

Data

Note that we have two versions of data splits: 1) data/v1.0 and 2) data/v1.1. The differences lie in the low-resource languages (see the prompt in prompts.txt).

  • data/v1.0: We used this version to fine-tune XLM-R, so we had to keep a training set of 500 samples for each low-resource language.

  • data/v1.1: To better evaluate LLMs, we re-split the data and keep only 100 samples as a development set for the 6 low-resource languages, so that more data can be used to evaluate LLMs.

*Figure: MELA data splits.*
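A split file can be loaded with a few lines of standard-library Python. This is a hedged sketch: the assumed layout (a tab-separated file with `sentence` and binary `label` columns) is an illustration, not the repository's documented schema; adjust the column names to match the extracted files.

```python
import csv

def load_split(path):
    """Load one MELA split from a TSV file.

    Assumes (hypothetically) a header row with "sentence" and "label"
    columns, where label is 1 for acceptable and 0 for unacceptable.
    Returns a list of (sentence, label) pairs.
    """
    with open(path, encoding="utf-8") as f:
        return [(row["sentence"], int(row["label"]))
                for row in csv.DictReader(f, delimiter="\t")]
```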

Benchmarking results

We list the results of several multilingual LLMs (evaluated on v1.1) along with fine-tuned XLM-R (evaluated on v1.0). Following previous work on linguistic acceptability, we use the Matthews correlation coefficient (MCC) as the evaluation metric.
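For reference, MCC can be computed directly from the binary confusion counts; the sketch below assumes labels where 1 marks an acceptable sentence and 0 an unacceptable one:

```python
import math

def mcc(gold, pred):
    """Matthews correlation coefficient for binary labels (1/0)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    tn = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 0)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: MCC is 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

MCC ranges from -1 to 1, with 0 for chance-level predictions, which is why it is preferred over accuracy on the class-imbalanced acceptability corpora.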

*Figure: benchmarking results.*

Cross-lingual transfer results

To observe the transfer of acceptability judgements across languages, we train the model on one language and evaluate it on all 10 development sets.
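The protocol can be sketched as a 10×10 grid of (training language, evaluation language) scores. In the sketch below, `train` and `evaluate` are placeholders for the actual fine-tuning and MCC scoring steps (a trivial stub keeps it runnable), not the repository's code:

```python
LANGS = ["en", "zh", "it", "ru", "de", "fr", "es", "ja", "ar", "is"]

def train(lang):
    # Placeholder for fine-tuning on one language's training set:
    # here it just returns a model that predicts "acceptable" always.
    return lambda sentence: 1

def evaluate(model, lang):
    # Placeholder for scoring the model on `lang`'s development set
    # (MCC in the paper); a real run would call mcc(gold, preds).
    return 0.0

# One row per training language, one column per evaluation language.
transfer = {src: {tgt: evaluate(train(src), tgt) for tgt in LANGS}
            for src in LANGS}
```

The diagonal of this grid gives in-language performance; off-diagonal cells measure cross-lingual transfer.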

*Figure: cross-lingual transfer results.*

Probing results

We train probing classifiers on span representations from the XLM-R models on six English probing tasks: 1) part-of-speech tagging, 2) dependency labeling, 3) constituency labeling, 4) named entity labeling, 5) semantic role labeling, and 6) coreference.

*Figure: probing results.*

Contributors

Ziyin Zhang, Yikang Liu, Weifang Huang, Junyu Mao, Rui Wang, Hai Hu

Citation

If you use our dataset, please cite our paper as well as all the other corpora of linguistic acceptability used in MELA (see the citations.bib file).

@inproceedings{Zhang2024MELA,
  author    = {Ziyin Zhang and Yikang Liu and Weifang Huang and Junyu Mao and Rui Wang and Hai Hu},
  editor    = {Lun{-}Wei Ku and Andre Martins and Vivek Srikumar},
  title     = {{MELA:} Multilingual Evaluation of Linguistic Acceptability},
  booktitle = {Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), {ACL} 2024, Bangkok, Thailand, August 11-16, 2024},
  pages     = {2658--2674},
  publisher = {Association for Computational Linguistics},
  year      = {2024},
  url       = {https://doi.org/10.18653/v1/2024.acl-long.146},
  doi       = {10.18653/V1/2024.ACL-LONG.146}
}

Owner

  • Name: sjtu-compling
  • Login: sjtu-compling
  • Kind: organization

Citation (citations.bib)

@article{Warstadt2019CoLA,
  title={Neural network acceptability judgments},
  author={Warstadt, Alex and Singh, Amanpreet and Bowman, Samuel R},
  journal={Transactions of the Association for Computational Linguistics},
  volume={7},
  pages={625--641},
  year={2019}
}

@inproceedings{Trotta2021ItaCoLA,
  title={Monolingual and Cross-Lingual Acceptability Judgments with the Italian CoLA corpus},
  author={Trotta, Daniela and Guarasci, Raffaele and Leonardelli, Elisa and Tonelli, Sara},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2021},
  pages={2929--2940},
  year={2021}
}

@inproceedings{Mikhailov2022RuCoLA,
  title={RuCoLA: Russian Corpus of Linguistic Acceptability},
  author={Mikhailov, Vladislav and Shamardina, Tatiana and Ryabinin, Max and Pestova, Alena and Smurov, Ivan and Artemova, Ekaterina},
  booktitle={Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing},
  pages={5207--5227},
  year={2022}
}

@misc{Hu2023CoLAC,
  title={Revisiting Acceptability Judgements},
  author={Hu, Hai and Zhang, Ziyin and Huang, Weifang and Lai, Jackie Yan-Ki and Li, Aini and Ma, Yina and Huang, Jiahui and Zhang, Peng and Wang, Rui},
  year={2023},
  eprint={2305.14091},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

@misc{Zhang2023MELA,
  title={MELA: Multilingual Evaluation of Linguistic Acceptability}, 
  author={Ziyin Zhang and Yikang Liu and Weifang Huang and Junyu Mao and Rui Wang and Hai Hu},
  year={2023},
  eprint={2311.09033},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

GitHub Events

Total
  • Watch event: 4
  • Push event: 1
  • Fork event: 1
Last Year
  • Watch event: 4
  • Push event: 1
  • Fork event: 1