slovenian-llm-eval

Slovenian LLM Eval.

https://github.com/gordicaleksa/slovenian-llm-eval

Science Score: 31.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (13.0%) to scientific vocabulary

Scientific Fields

Artificial Intelligence and Machine Learning Computer Science - 87% confidence

Last synced: 6 months ago

Repository

Slovenian LLM Eval.

Basic Info

Host: GitHub
Owner: gordicaleksa
License: mit
Language: Python
Default Branch: slovenian_eval_translate
Size: 12.2 MB

Statistics

Stars: 7
Watchers: 2
Forks: 0
Open Issues: 0
Releases: 0

Created almost 2 years ago · Last pushed almost 2 years ago

Metadata Files

Readme, License, Citation, Codeowners

Slovenian LLM eval

Currently supported:

ARC-Challenge

What we want to support:

Common sense reasoning: Hellaswag, Winogrande, PIQA, OpenbookQA, ARC-Easy, ARC-Challenge
World knowledge: NaturalQuestions, TriviaQA
Reading comprehension: BoolQ

Please email me at gordicaleksa at gmail com in case you're willing to sponsor the automated GPT-4 effort. You will get the credits and eternal glory. :)

Creating the eval - instructions

IMPORTANT

Running this will consume your Google Cloud credits, or bill you if billing is already enabled (this only happens after you spend your free credits and then deliberately enable billing again).
You can use your free credits to translate 500,000 characters per month!
If this is the first time you're creating a gcloud project, you'll have $300 of free credits!
Sync with Aleksa on Discord, in the slovenian-eval channel, about which tasks to tackle next.
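To get a feel for how far the free allowance goes, you can estimate the character count of a batch before sending anything to the API. The sketch below is illustrative only: the helper names `count_chars` and `fits_free_tier` are hypothetical and not part of this repository, and it assumes every character of the input text (whitespace included) counts once toward the 500,000-character monthly figure quoted above.

```python
# Free monthly allowance quoted in this README (assumed to be counted per character).
FREE_CHARS_PER_MONTH = 500_000


def count_chars(texts):
    """Return the total number of characters across all strings in `texts`."""
    return sum(len(text) for text in texts)


def fits_free_tier(texts, already_used=0, limit=FREE_CHARS_PER_MONTH):
    """Return True if translating `texts` stays within the remaining allowance.

    `already_used` is the number of characters you have translated this month.
    """
    return already_used + count_chars(texts) <= limit
```

For example, `fits_free_tier(docs, already_used=450_000)` tells you whether a batch still fits after you have already translated 450,000 characters this month.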

Prerequisites

Before you begin, ensure you meet the following requirements:

For Linux Users:

Git and Miniconda

For Windows Users:

1. Windows Subsystem for Linux (WSL2). If you don't have WSL2 installed, run the following in Windows cmd/PowerShell in administrator mode:

```bash
wsl --install

# Check the version and distribution name.
wsl -l -v

# Set the newly downloaded Linux distro as the default.
wsl --set-default <distribution name>
```

2. Install Git from the WSL terminal.

```bash
sudo apt update
sudo apt install git
git --version
```

3. Install Miniconda from the WSL terminal.

```bash
mkdir -p ~/miniconda3

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh

bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3

rm -rf ~/miniconda3/miniconda.sh

# Initialize conda with bash.
~/miniconda3/bin/conda init bash
```

4. Follow the instructions below on WSL.

Instructions for translating lm harness eval from English into Slovenian

First, let's set up a minimal Python program to make sure you can run Google Translate on your local machine.

1. Create a Google Console project (https://console.cloud.google.com/)
2. Enable the Google Translation API. To enable it you have to set up billing and enter your credit card details. A note regarding safety: you'll have $300 of free credit (if this is the first time you're doing it), and no one can charge your credit card unless all those free credits are spent and you re-enable billing. If you already had billing set up, you have 500,000 chars/month for free.
3. Install the Google Cloud CLI (gsutil) on your machine (see this: https://cloud.google.com/storage/docs/gsutil_install/)

a.) Download the Linux archive file (find the latest version from the link above).

curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-455.0.0-linux-x86_64.tar.gz

b.) Extract the contents of the archive file above.

tar -xf google-cloud-cli-455.0.0-linux-x86_64.tar.gz

c.) Run the installation script.

./google-cloud-sdk/install.sh

d.) Initialize and authenticate your account.

./google-cloud-sdk/bin/gcloud init

e.) Create a credentials file.

gcloud auth application-default login

4. Create and set up the conda env

a.) Open a terminal (on Windows use the WSL terminal; on Linux just use your regular terminal, where conda will already be in the PATH)

b.) Run conda create -n open_nllb python=3.10 -y

c.) Run conda activate open_nllb

d.) Run pip install google-cloud-translate

That's it! After that, just create a test.py Python file with the following code and run it with the Run and Debug option in VS Code after creating the launch.json file:

```Python
from google.cloud import translate

# Client for the Cloud Translation API; it picks up your application-default credentials.
client = translate.TranslationServiceClient()
location = "global"
project_id = ""  # fill in your Google Cloud project ID
parent = f"projects/{project_id}/locations/{location}"

# Translate a single English sentence into Slovenian ("sl").
response = client.translate_text(
    request={
        "parent": parent,
        "contents": ["How do you do? Translate this."],
        "mime_type": "text/plain",
        "source_language_code": "en-US",
        "target_language_code": "sl",
    }
)
value_translated = response.translations[0].translated_text
print(value_translated)
```
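If the test program fails, it can help to first confirm that the tools from the setup steps are actually reachable. The following is a minimal sketch, not part of this repository: the function name `preflight` is hypothetical, and the credentials path is an assumption about where `gcloud auth application-default login` writes its file on Linux/WSL.

```python
import os
import shutil

# Assumed default location of the application-default credentials file on Linux/WSL.
ADC_PATH = os.path.expanduser("~/.config/gcloud/application_default_credentials.json")


def preflight(tools=("git", "conda", "gcloud"), adc_path=ADC_PATH):
    """Return a dict mapping each check to True (found) or False (missing)."""
    # shutil.which returns None when the executable is not on the PATH.
    report = {tool: shutil.which(tool) is not None for tool in tools}
    report["application_default_credentials"] = os.path.isfile(adc_path)
    return report


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{'ok' if ok else 'MISSING':8}{name}")
```

Note that `gcloud` may be reported as missing if you installed it from the archive and have not added `google-cloud-sdk/bin` to your PATH; in that case calling it by its full path, as in the steps above, still works.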

Running translation of evals from English into Slovenian

Follow these instructions (see below for more details):

1. Create a Python env for this project
2. You'll find the program arguments are already specified inside .vscode/launch.json
3. Change translation_project_id to the Google project ID you got in the previous section
4. Specify the number of characters you're willing to translate (500_000 is the usual free monthly limit)
5. Run main.py

Create Python environment

You can reuse the open_nllb conda env created above. Next, navigate to the root of the project and run pip install -e .

If you encounter any issue please report immediately on Discord! :) We'll fix it quickly.

Run translation

Finally run (note model and model_args are not important for us but we need to specify them):

python main.py \
    --model hf \
    --model_args pretrained=mistralai/Mistral-7B-v0.1 \
    --tasks <task> \
    --translation_project_id <your project id> \
    --char_limit 500000 \
    --start_from_doc_index 0

or open main.py and run it using the VS Code debugger.
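If you prefer to launch the run from a script, the command can be assembled programmatically. This is a sketch under stated assumptions: `build_command` is a hypothetical helper, not something shipped in this repository, and the task names are copied from the note that follows. The flags themselves are the ones shown in the command above.

```python
import shlex

# Task names taken from the note in this README.
KNOWN_TASKS = (
    "hellaswag", "winogrande", "piqa", "openbookqa", "arc_easy",
    "arc_challenge", "nq_open", "triviaqa", "boolq",
)


def build_command(task, project_id, char_limit=500_000, start_from_doc_index=0):
    """Build the argument list for one translation run of a single task."""
    if task not in KNOWN_TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {KNOWN_TASKS}")
    return [
        "python", "main.py",
        "--model", "hf",
        "--model_args", "pretrained=mistralai/Mistral-7B-v0.1",
        "--tasks", task,
        "--translation_project_id", project_id,
        "--char_limit", str(char_limit),
        "--start_from_doc_index", str(start_from_doc_index),
    ]


if __name__ == "__main__":
    # "my-project-id" is a placeholder; use your own Google Cloud project ID.
    print(shlex.join(build_command("arc_challenge", "my-project-id")))
```

The returned list can be passed directly to `subprocess.run` from the root of the project; validating the task name up front avoids spending a run on a typo.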

Note:

* Again, please sync on Discord about which tasks you should help translate! :)
* Select only one task at a time. Possible options: hellaswag, winogrande, piqa, openbookqa, arc_easy, arc_challenge, nq_open, triviaqa, boolq
* start_from_doc_index is used if you want to resume and translate a particular task starting from a certain document index (useful in a collaborative setting where multiple people are translating different portions of the task)
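To illustrate how a character limit and a starting document index can work together when resuming, here is a minimal sketch. It is not the repository's actual implementation: `translate_with_budget` and its `translate_fn` parameter are hypothetical, and the sketch assumes a document is only translated if it fits entirely within the remaining budget.

```python
def translate_with_budget(docs, translate_fn, char_limit, start_from_doc_index=0):
    """Translate docs from a starting index until the character budget runs out.

    Returns (translations, next_index, chars_used), where next_index is the
    document index to pass as the starting point of the next run.
    """
    translations = []
    chars_used = 0
    index = start_from_doc_index
    while index < len(docs):
        cost = len(docs[index])
        # Stop before a document that would push us over the budget.
        if chars_used + cost > char_limit:
            break
        translations.append(translate_fn(docs[index]))
        chars_used += cost
        index += 1
    return translations, index, chars_used
```

In a collaborative setting, each person would start from the `next_index` reported by the previous run (or from an agreed offset), so that no document is translated twice.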

Credits

todo after the project is completed

Owner

Name: Aleksa Gordić
Login: gordicaleksa
Kind: user
Location: San Francisco
Company: ex-DeepMind, ex-Microsoft

Website: https://gordicaleksa.com/
Twitter: gordic_aleksa
Repositories: 54
Profile: https://github.com/gordicaleksa

Flirting with LLMs. Tensor Core maximalist. If I say stupid stuff it's not me it's my prompt.

Citation (CITATION.bib)

@software{eval-harness,
  author       = {Gao, Leo and
                  Tow, Jonathan and
                  Biderman, Stella and
                  Black, Sid and
                  DiPofi, Anthony and
                  Foster, Charles and
                  Golding, Laurence and
                  Hsu, Jeffrey and
                  McDonell, Kyle and
                  Muennighoff, Niklas and
                  Phang, Jason and
                  Reynolds, Laria and
                  Tang, Eric and
                  Thite, Anish and
                  Wang, Ben and
                  Wang, Kevin and
                  Zou, Andy},
  title        = {A framework for few-shot language model evaluation},
  month        = sep,
  year         = 2021,
  publisher    = {Zenodo},
  version      = {v0.0.1},
  doi          = {10.5281/zenodo.5371628},
  url          = {https://doi.org/10.5281/zenodo.5371628}
}

GitHub Events

Total

Watch event: 1
Fork event: 1

Last Year

Watch event: 1
Fork event: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 0
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 0
Total pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Dependencies

Dockerfile docker

nvidia/cuda 11.2.2-cudnn8-runtime-ubuntu20.04 build

requirements.txt pypi

setup.py pypi

accelerate >=0.17.1
datasets >=2.0.0
einops *
importlib-resources *
jsonlines *
numexpr *
omegaconf >=2.2
openai >=0.6.4
peft >=0.2.0
pybind11 >=2.6.2
pycountry *
pytablewriter *
rouge-score >=0.0.4
sacrebleu ==1.5.0
scikit-learn >=0.24.1
sqlitedict *
torch >=1.7
tqdm-multiprocess *
transformers >=4.1
zstandard *