serbian-llm-eval

Serbian LLM Eval.

https://github.com/gordicaleksa/serbian-llm-eval

Science Score: 18.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.0%) to scientific vocabulary

Keywords

bosnian croatian eval llm serbian
Last synced: 6 months ago

Repository

Serbian LLM Eval.

Basic Info
  • Host: GitHub
  • Owner: gordicaleksa
  • License: other
  • Language: Python
  • Default Branch: serb_eval_run
  • Homepage: https://gordicaleksa.com/
  • Size: 12.3 MB
Statistics
  • Stars: 96
  • Watchers: 5
  • Forks: 8
  • Open Issues: 2
  • Releases: 0
Topics
bosnian croatian eval llm serbian
Created about 2 years ago · Last pushed almost 2 years ago
Metadata Files
Readme · License · Citation · Codeowners

README.md

Serbian LLM eval 🇷🇸

Note: it can likely also be used for other HBS languages (Croatian, Bosnian, Montenegrin) - support for these languages is on my roadmap (see future work).

What is currently covered:

  • Common sense reasoning: Hellaswag, Winogrande, PIQA, OpenbookQA, ARC-Easy, ARC-Challenge
  • World knowledge: NaturalQuestions, TriviaQA
  • Reading comprehension: BoolQ

You can find the Serbian LLM eval dataset on HuggingFace. For more details on how the dataset was built, see the technical report on Weights & Biases. The serb_eval_translate branch was used for the machine translation pass, while serb_eval_refine was used for further refinement using GPT-4.
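For a quick local look at the data, here is a minimal sketch using the Hugging Face datasets library. The dataset id gordicaleksa/serbian-llm-eval-v1 comes from the citation further down this page; the per-task config name ("hellaswag") is an assumption, so check the dataset card for the exact configuration names.

```
pip install datasets
# "hellaswag" as the config name is an assumption; other tasks should follow
# the same pattern if the dataset is published with one configuration per task.
python -c "from datasets import load_dataset; print(load_dataset('gordicaleksa/serbian-llm-eval-v1', 'hellaswag'))"
```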

Please email me at gordicaleksa at gmail com if you're willing to sponsor the projects I'm working on.

You will get the credits and eternal glory. :)

The same appeal, originally written in Serbian: if you're willing to financially support this endeavor of using ChatGPT to obtain higher-quality data, one that is of national/regional interest, my email is gordicaleksa at gmail com. You will be credited as a sponsor of this project (and become part of history). :)

Furthermore, this project will help bootstrap a local large language model ecosystem.

Run the evals

Step 1. Create Python environment

```
git clone https://github.com/gordicaleksa/lm-evaluation-harness-serbian
cd lm-evaluation-harness-serbian
pip install -e .
```

Currently you might also need to manually install the following packages (via pip install): sentencepiece, protobuf, and one more (submit a PR if you hit this).
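For example, a minimal sketch (the third missing package is not named in this README, so it is left out here):

```
pip install sentencepiece protobuf
```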

Step 2. Tweak the launch.json and run

  • --model_args <- any name from HuggingFace, or a path to a HuggingFace-compatible checkpoint, will work
  • --tasks <- pick any subset of: arc_challenge,arc_easy,boolq,hellaswag,openbookqa,piqa,winogrande,nq_open,triviaqa
  • --num_fewshot <- the number of shots; should be 0 for all tasks except nq_open and triviaqa (run these in a 5-shot manner if you want to compare against Mistral 7B)
  • --batch_size <- set this as high as your available VRAM allows to get the maximum speed-up
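Putting these flags together, here is a sketch of an equivalent command-line run. Only the four flags above are documented in this README; the main.py entry point, the --model hf-causal backend name, and the example model id are assumptions based on how the upstream lm-evaluation-harness of that era was typically invoked, so adjust them to match the fork's launch.json.

```
# Sketch only: main.py, --model hf-causal, and the model id are assumptions;
# the four flags below are the ones documented above (0-shot tasks only here).
python main.py \
    --model hf-causal \
    --model_args pretrained=mistralai/Mistral-7B-v0.1 \
    --tasks arc_challenge,arc_easy,boolq,hellaswag,openbookqa,piqa,winogrande \
    --num_fewshot 0 \
    --batch_size 16
```

For nq_open and triviaqa, run a separate invocation with --num_fewshot 5, as noted above.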

Future work:

  • Cover popular aggregated benchmarks (MMLU, BBH, AGI Eval) and math benchmarks (GSM8K, MATH)
  • Explicit support for other HBS languages.

Sponsors

Thanks to all of our sponsor(s) for donating to the yugoGPT (the first 7B HBS LLM) & Serbian LLM eval projects.

The yugoGPT base model will soon be open-sourced under the permissive Apache 2.0 license.

Platinum sponsors

  • Ivan (anon)

Gold sponsors

Silver sponsors

Also a big thank you to the following individuals:
  • Slobodan Marković - for spreading the word! :)
  • Aleksander Segedi - for help with bookkeeping

Credits

A huge thank you to the following technical contributors who helped translate the evals from English into Serbian:
  • Vera Prohaska
  • Chu Kin Chan
  • Joe Makepeace
  • Toby Farmer
  • Malvi Bid
  • Raphael Vienne
  • Nenad Aksentijevic
  • Isaac Nicolas
  • Brian Pulfer
  • Aldin Cimpo

License

Apache 2.0

Citation

@article{serbian-llm-eval,
  author       = "Gordić Aleksa",
  title        = "Serbian LLM Eval",
  year         = "2023",
  howpublished = {\url{https://huggingface.co/datasets/gordicaleksa/serbian-llm-eval-v1}},
}

Owner

  • Name: Aleksa Gordić
  • Login: gordicaleksa
  • Kind: user
  • Location: San Francisco
  • Company: ex-DeepMind, ex-Microsoft

Flirting with LLMs. Tensor Core maximalist. If I say stupid stuff it's not me it's my prompt.

Citation (CITATION.bib)

@software{eval-harness,
  author       = {Gao, Leo and
                  Tow, Jonathan and
                  Biderman, Stella and
                  Black, Sid and
                  DiPofi, Anthony and
                  Foster, Charles and
                  Golding, Laurence and
                  Hsu, Jeffrey and
                  McDonell, Kyle and
                  Muennighoff, Niklas and
                  Phang, Jason and
                  Reynolds, Laria and
                  Tang, Eric and
                  Thite, Anish and
                  Wang, Ben and
                  Wang, Kevin and
                  Zou, Andy},
  title        = {A framework for few-shot language model evaluation},
  month        = sep,
  year         = 2021,
  publisher    = {Zenodo},
  version      = {v0.0.1},
  doi          = {10.5281/zenodo.5371628},
  url          = {https://doi.org/10.5281/zenodo.5371628}
}

GitHub Events

Total
  • Watch event: 10
  • Fork event: 1
Last Year
  • Watch event: 10
  • Fork event: 1

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 1
  • Total pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: about 2 hours
  • Total issue authors: 1
  • Total pull request authors: 3
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: about 2 hours
  • Issue authors: 1
  • Pull request authors: 3
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ikamensh (1)
Pull Request Authors
  • Stopwolf (2)
  • malvibid (1)
  • BrianPulfer (1)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

Dockerfile docker
  • nvidia/cuda 11.2.2-cudnn8-runtime-ubuntu20.04 build
requirements.txt pypi
setup.py pypi
  • accelerate >=0.17.1
  • datasets >=2.0.0
  • einops *
  • importlib-resources *
  • jsonlines *
  • numexpr *
  • omegaconf >=2.2
  • openai >=0.6.4
  • peft >=0.2.0
  • pybind11 >=2.6.2
  • pycountry *
  • pytablewriter *
  • rouge-score >=0.0.4
  • sacrebleu ==1.5.0
  • scikit-learn >=0.24.1
  • sqlitedict *
  • torch >=1.7
  • tqdm-multiprocess *
  • transformers >=4.1
  • zstandard *