multilingualanalysis

The purpose of this repository is to review how various existing LLMs handle non-English text.

https://github.com/34-matt/multilingualanalysis

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (4.5%) to scientific vocabulary
Last synced: 6 months ago

Repository

The purpose of this repository is to review how various existing LLMs handle non-English text.

Basic Info
  • Host: GitHub
  • Owner: 34-Matt
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 165 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 11 months ago · Last pushed 10 months ago
Metadata Files
  • Readme
  • Citation

README.md

MultiLingualAnalysis

The purpose of this repository is to review how various existing LLMs handle non-English text.

Setup

Install the Python packages with:

    pip install -r requirements.txt

Afterwards, get a Hugging Face API key to gain access to restricted models. Once you have the key, set it as an environment variable:

    export HUGGINGFACE_TOKEN=your_token_here
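Code that needs the token can then read it back from the environment. The following is a minimal sketch and not code taken from this repository: the helper names `get_hf_token` and `expose_token_to_hf_libraries` are hypothetical. It assumes the variable name `HUGGINGFACE_TOKEN` from the setup step. `HF_TOKEN` is the variable that the `huggingface_hub` library reads by default, so copying the value there is one way to let Hugging Face libraries find the token without passing it explicitly.

```python
import os


def get_hf_token(env_var: str = "HUGGINGFACE_TOKEN") -> str:
    """Return the token stored in `env_var`, or raise a clear error if it is unset."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"{env_var} is not set; export it before loading restricted models."
        )
    return token


def expose_token_to_hf_libraries(env_var: str = "HUGGINGFACE_TOKEN") -> None:
    """Copy the token to HF_TOKEN unless HF_TOKEN is already defined.

    Assumption: the downstream libraries rely on huggingface_hub's default
    lookup of HF_TOKEN rather than reading HUGGINGFACE_TOKEN themselves.
    """
    os.environ.setdefault("HF_TOKEN", get_hf_token(env_var))


# Example usage (hypothetical; call once before loading a restricted model):
# expose_token_to_hf_libraries()
```

Failing early with an explicit error tends to be easier to diagnose than an authorization failure raised partway through a model download.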

Owner

  • Name: TriBlades
  • Login: 34-Matt
  • Kind: user

Citation (citations.bib)

@inproceedings{wolf-etal-2020-transformers,
    title = "Transformers: State-of-the-Art Natural Language Processing",
    author = "Thomas Wolf and Lysandre Debut and Victor Sanh and Julien Chaumond and Clement Delangue and Anthony Moi and Pierric Cistac and Tim Rault and Rémi Louf and Morgan Funtowicz and Joe Davison and Sam Shleifer and Patrick von Platen and Clara Ma and Yacine Jernite and Julien Plu and Canwen Xu and Teven Le Scao and Sylvain Gugger and Mariama Drame and Quentin Lhoest and Alexander M. Rush",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
    month = oct,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.emnlp-demos.6",
    pages = "38--45"
}

GitHub Events

Total
  • Push event: 6
  • Create event: 2
Last Year
  • Push event: 6
  • Create event: 2

Dependencies

requirements.txt pypi
  • dotenv ==0.9.9
  • transformer-lens ==2.15.0