serbian-llm-eval

Serbian LLM Eval.

https://github.com/gordicaleksa/serbian-llm-eval

Science Score: 18.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.0%) to scientific vocabulary

Keywords

bosnian croatian eval llm serbian
Last synced: 6 months ago

Repository

Serbian LLM Eval.

Basic Info
  • Host: GitHub
  • Owner: gordicaleksa
  • License: other
  • Language: Python
  • Default Branch: serb_eval_run
  • Homepage: https://gordicaleksa.com/
  • Size: 12.3 MB
Statistics
  • Stars: 96
  • Watchers: 5
  • Forks: 8
  • Open Issues: 2
  • Releases: 0
Topics
bosnian croatian eval llm serbian
Created about 2 years ago · Last pushed almost 2 years ago
Metadata Files
Readme · License · Citation · Codeowners

README.md

Serbian LLM eval 🇷🇸

Note: it can likely also be used for other HBS languages (Croatian, Bosnian, Montenegrin) - support for these languages is on my roadmap (see future work).

What is currently covered:

  • Common sense reasoning: Hellaswag, Winogrande, PIQA, OpenbookQA, ARC-Easy, ARC-Challenge
  • World knowledge: NaturalQuestions, TriviaQA
  • Reading comprehension: BoolQ

You can find the Serbian LLM eval dataset on HuggingFace. For more details on how the dataset was built, see the technical report on Weights & Biases. The serb_eval_translate branch was used for the machine translation pass, while serb_eval_refine was used for further refinement using GPT-4.
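For a quick local look at the data, here is a minimal sketch using the Hugging Face datasets library. The dataset id gordicaleksa/serbian-llm-eval-v1 comes from the citation further down this page; the per-task config name ("hellaswag") is an assumption, so check the dataset card for the exact configuration names.

```
pip install datasets
# "hellaswag" as the config name is an assumption; other tasks should follow
# the same pattern if the dataset is published with one configuration per task.
python -c "from datasets import load_dataset; print(load_dataset('gordicaleksa/serbian-llm-eval-v1', 'hellaswag'))"
```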

Please email me at gordicaleksa at gmail com if you're willing to sponsor the projects I'm working on.

You will get the credits and eternal glory. :)

The same appeal, originally written in Serbian: if you're willing to financially support this endeavor of using ChatGPT to obtain higher-quality data, one that is of national/regional interest, my email is gordicaleksa at gmail com. You will be credited as a sponsor of this project (and become part of history). :)

Furthermore, this project will help bootstrap a local large language model ecosystem.

Run the evals

Step 1. Create Python environment

```
git clone https://github.com/gordicaleksa/lm-evaluation-harness-serbian
cd lm-evaluation-harness-serbian
pip install -e .
```

Currently you might also need to manually install the following packages (via pip install): sentencepiece, protobuf, and one more (submit a PR if you hit this).
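For example, a minimal sketch (the third missing package is not named in this README, so it is left out here):

```
pip install sentencepiece protobuf
```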

Step 2. Tweak the launch.json and run

  • --model_args <- any name from HuggingFace, or a path to a HuggingFace-compatible checkpoint, will work
  • --tasks <- pick any subset of: arc_challenge,arc_easy,boolq,hellaswag,openbookqa,piqa,winogrande,nq_open,triviaqa
  • --num_fewshot <- the number of shots; should be 0 for all tasks except nq_open and triviaqa (run these in a 5-shot manner if you want to compare against Mistral 7B)
  • --batch_size <- set this as high as your available VRAM allows to get the maximum speed-up
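Putting these flags together, here is a sketch of an equivalent command-line run. Only the four flags above are documented in this README; the main.py entry point, the --model hf-causal backend name, and the example model id are assumptions based on how the upstream lm-evaluation-harness of that era was typically invoked, so adjust them to match the fork's launch.json.

```
# Sketch only: main.py, --model hf-causal, and the model id are assumptions;
# the four flags below are the ones documented above (0-shot tasks only here).
python main.py \
    --model hf-causal \
    --model_args pretrained=mistralai/Mistral-7B-v0.1 \
    --tasks arc_challenge,arc_easy,boolq,hellaswag,openbookqa,piqa,winogrande \
    --num_fewshot 0 \
    --batch_size 16
```

For nq_open and triviaqa, run a separate invocation with --num_fewshot 5, as noted above.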

Future work:

  • Cover popular aggregated benchmarks (MMLU, BBH, AGI Eval) and math benchmarks (GSM8K, MATH)
  • Explicit support for other HBS languages.

Sponsors

Thanks to all of our sponsor(s) for donating to the yugoGPT (the first 7B HBS LLM) & Serbian LLM eval projects.

The yugoGPT base model will soon be open-sourced under the permissive Apache 2.0 license.

Platinum sponsors

  • Ivan (anon)

Gold sponsors

Silver sponsors

Also a big thank you to the following individuals:
  • Slobodan Marković - for spreading the word! :)
  • Aleksander Segedi - for help with bookkeeping

Credits

A huge thank you to the following technical contributors who helped translate the evals from English into Serbian:
  • Vera Prohaska
  • Chu Kin Chan
  • Joe Makepeace
  • Toby Farmer
  • Malvi Bid
  • Raphael Vienne
  • Nenad Aksentijevic
  • Isaac Nicolas
  • Brian Pulfer
  • Aldin Cimpo

License

Apache 2.0

Citation

@article{serbian-llm-eval,
  author       = "Gordić Aleksa",
  title        = "Serbian LLM Eval",
  year         = "2023",
  howpublished = {\url{https://huggingface.co/datasets/gordicaleksa/serbian-llm-eval-v1}},
}

Owner

  • Name: Aleksa Gordić
  • Login: gordicaleksa
  • Kind: user
  • Location: San Francisco
  • Company: ex-DeepMind, ex-Microsoft

Flirting with LLMs. Tensor Core maximalist. If I say stupid stuff it's not me it's my prompt.

Citation (CITATION.bib)

@software{eval-harness,
  author       = {Gao, Leo and
                  Tow, Jonathan and
                  Biderman, Stella and
                  Black, Sid and
                  DiPofi, Anthony and
                  Foster, Charles and
                  Golding, Laurence and
                  Hsu, Jeffrey and
                  McDonell, Kyle and
                  Muennighoff, Niklas and
                  Phang, Jason and
                  Reynolds, Laria and
                  Tang, Eric and
                  Thite, Anish and
                  Wang, Ben and
                  Wang, Kevin and
                  Zou, Andy},
  title        = {A framework for few-shot language model evaluation},
  month        = sep,
  year         = 2021,
  publisher    = {Zenodo},
  version      = {v0.0.1},
  doi          = {10.5281/zenodo.5371628},
  url          = {https://doi.org/10.5281/zenodo.5371628}
}

GitHub Events

Total
  • Watch event: 10
  • Fork event: 1
Last Year
  • Watch event: 10
  • Fork event: 1

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 1
  • Total pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: about 2 hours
  • Total issue authors: 1
  • Total pull request authors: 3
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: about 2 hours
  • Issue authors: 1
  • Pull request authors: 3
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ikamensh (1)
Pull Request Authors
  • Stopwolf (2)
  • malvibid (1)
  • BrianPulfer (1)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

Dockerfile docker
  • nvidia/cuda 11.2.2-cudnn8-runtime-ubuntu20.04 build
requirements.txt pypi
setup.py pypi
  • accelerate >=0.17.1
  • datasets >=2.0.0
  • einops *
  • importlib-resources *
  • jsonlines *
  • numexpr *
  • omegaconf >=2.2
  • openai >=0.6.4
  • peft >=0.2.0
  • pybind11 >=2.6.2
  • pycountry *
  • pytablewriter *
  • rouge-score >=0.0.4
  • sacrebleu ==1.5.0
  • scikit-learn >=0.24.1
  • sqlitedict *
  • torch >=1.7
  • tqdm-multiprocess *
  • transformers >=4.1
  • zstandard *