lm-evaluation-harness-medical-specialities
https://github.com/hpai-bsc/lm-evaluation-harness-medical-specialities
Science Score: 77.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ✓ Committers with academic emails: 16 of 214 committers (7.5%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (4.2%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: HPAI-BSC
- License: MIT
- Language: Python
- Default Branch: main
- Size: 22.6 MB
Statistics
- Stars: 3
- Watchers: 2
- Forks: 2
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Language Model Evaluation Harness with Medical Specialities Classification
A fork of the official lm-evaluation-harness repo that adds the medical_specialities task, which classifies questions into their medical specialities.
How to use it
Follow the official lm-evaluation-harness guide, selecting the medical_specialities task:
```bash
lm_eval --model hf \
    --model_args pretrained=EleutherAI/pythia-160m \
    --tasks medical_specialities \
    --device cuda:0 \
    --batch_size 8
```
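To launch the same evaluation from a script, for example once per checkpoint, the command can be assembled in Python. This is a minimal sketch, not part of the harness: `build_lm_eval_command` is a hypothetical helper, and it only uses the flags and values shown in the command above.

```python
import shlex


def build_lm_eval_command(pretrained, device="cuda:0", batch_size=8):
    """Return the argument list for the lm_eval call shown in this README.

    `pretrained` is a Hugging Face model id such as "EleutherAI/pythia-160m".
    """
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", f"pretrained={pretrained}",
        "--tasks", "medical_specialities",
        "--device", device,
        "--batch_size", str(batch_size),
    ]


if __name__ == "__main__":
    # Print the shell-quoted command instead of running it.
    print(shlex.join(build_lm_eval_command("EleutherAI/pythia-160m")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once `lm_eval` is installed; keeping it as a list avoids shell quoting issues.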
With this you will get a list of metrics per speciality, which can help you identify whether a category is underrepresented or whether other biases are present:
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---:|---|---:|---|---|---:|---|---:|
| Allergy | 0 | none | 0 | acc | ↑ | 0.3200 | ± | 0.0469 |
| Anatomy | 0 | none | 0 | acc | ↑ | 0.2862 | ± | 0.0186 |
| Anesthesiology | 0 | none | 0 | acc | ↑ | 0.2577 | ± | 0.0344 |
| Biochemistry | 0 | none | 0 | acc | ↑ | 0.2388 | ± | 0.0104 |
| Cardiology | 0 | none | 0 | acc | ↑ | 0.2659 | ± | 0.0211 |
| Chemistry | 0 | none | 0 | acc | ↑ | 0.2587 | ± | 0.0193 |
| Dermatology | 0 | none | 0 | acc | ↑ | 0.2660 | ± | 0.0323 |
| Emergency | 0 | none | 0 | acc | ↑ | 0.2871 | ± | 0.0319 |
| Endocrinology | 0 | none | 0 | acc | ↑ | 0.2456 | ± | 0.0216 |
| Gastroenterology | 0 | none | 0 | acc | ↑ | 0.2364 | ± | 0.0207 |
| Genetics | 0 | none | 0 | acc | ↑ | 0.2776 | ± | 0.0192 |
| Geriatrics | 0 | none | 0 | acc | ↑ | 0.2609 | ± | 0.0532 |
| Gynecology | 0 | none | 0 | acc | ↑ | 0.3015 | ± | 0.0395 |
| Hematology | 0 | none | 0 | acc | ↑ | 0.2220 | ± | 0.0184 |
| Microbiology | 0 | none | 0 | acc | ↑ | 0.2576 | ± | 0.0141 |
| Nephrology | 0 | none | 0 | acc | ↑ | 0.2747 | ± | 0.0271 |
| Neurology | 0 | none | 0 | acc | ↑ | 0.2801 | ± | 0.0210 |
| Nursing | 0 | none | 0 | acc | ↑ | 0.2374 | ± | 0.0303 |
| Obstetrics | 0 | none | 0 | acc | ↑ | 0.2655 | ± | 0.0235 |
| Odontology | 0 | none | 0 | acc | ↑ | 0.3337 | ± | 0.0149 |
| Oncology | 0 | none | 0 | acc | ↑ | 0.2367 | ± | 0.0272 |
| Ophthalmology | 0 | none | 0 | acc | ↑ | 0.2500 | ± | 0.0367 |
| Orthopedics | 0 | none | 0 | acc | ↑ | 0.3180 | ± | 0.0317 |
| Otorhinolaryngology | 0 | none | 0 | acc | ↑ | 0.2775 | ± | 0.0310 |
| Pathology | 0 | none | 0 | acc | ↑ | 0.2680 | ± | 0.0452 |
| Pediatrics | 0 | none | 0 | acc | ↑ | 0.2959 | ± | 0.0267 |
| Pharmacology | 0 | none | 0 | acc | ↑ | 0.2772 | ± | 0.0158 |
| Physiology | 0 | none | 0 | acc | ↑ | 0.2559 | ± | 0.0254 |
| Psychiatry | 0 | none | 0 | acc | ↑ | 0.2601 | ± | 0.0143 |
| Psychology | 0 | none | 0 | acc | ↑ | 0.2686 | ± | 0.0202 |
| Radiology | 0 | none | 0 | acc | ↑ | 0.3371 | ± | 0.0504 |
| Respiratory | 0 | none | 0 | acc | ↑ | 0.2600 | ± | 0.0235 |
| Rheumatology | 0 | none | 0 | acc | ↑ | 0.2110 | ± | 0.0393 |
| Surgery | 0 | none | 0 | acc | ↑ | 0.2697 | ± | 0.0334 |
| Urology | 0 | none | 0 | acc | ↑ | 0.2727 | ± | 0.0427 |
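The Stderr column also hints at how many questions sit behind each speciality. For a 0/1 accuracy metric, the standard error of the mean is roughly sqrt(p(1 - p) / n), so a large Stderr suggests few questions. The sketch below inverts that relation for a few rows copied from the table. It assumes the reported Stderr is a plain binomial standard error rather than, say, a bootstrap estimate, so the counts are only approximate; `estimate_n` is a hypothetical helper, not part of the harness.

```python
def estimate_n(acc, stderr):
    """Approximate the number of questions behind an accuracy/stderr pair.

    Assumes stderr ~= sqrt(acc * (1 - acc) / n), i.e. a binomial standard error.
    """
    if stderr <= 0:
        raise ValueError("stderr must be positive")
    return round(acc * (1.0 - acc) / stderr ** 2)


# (accuracy, stderr) pairs copied from the table above.
rows = {
    "Allergy": (0.3200, 0.0469),
    "Biochemistry": (0.2388, 0.0104),
    "Geriatrics": (0.2609, 0.0532),
    "Odontology": (0.3337, 0.0149),
}

estimates = {name: estimate_n(acc, se) for name, (acc, se) in rows.items()}

# List specialities from the smallest to the largest estimated sample.
for name, n in sorted(estimates.items(), key=lambda kv: kv[1]):
    print(f"{name:<14} ~{n} questions")
```

Specialities with small estimates, such as Geriatrics in this sample, have noisier accuracies, so differences between them and larger categories should be read with care.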
- More info about the datasets: https://huggingface.co/datasets/HPAI-BSC/medical-specialities
- More info about the code to classify the questions: https://github.com/HPAI-BSC/medical-specialities
- Notebook with usage example: link
Owner
- Name: HPAI-BSC
- Login: HPAI-BSC
- Kind: organization
- Email: hpai@bsc.es
- Location: Barcelona
- Website: hpai.bsc.es
- Twitter: hpai_bsc
- Repositories: 18
- Profile: https://github.com/HPAI-BSC
Citation (CITATION.bib)
@misc{eval-harness,
author = {Gao, Leo and Tow, Jonathan and Abbasi, Baber and Biderman, Stella and Black, Sid and DiPofi, Anthony and Foster, Charles and Golding, Laurence and Hsu, Jeffrey and Le Noac'h, Alain and Li, Haonan and McDonell, Kyle and Muennighoff, Niklas and Ociepa, Chris and Phang, Jason and Reynolds, Laria and Schoelkopf, Hailey and Skowron, Aviya and Sutawika, Lintang and Tang, Eric and Thite, Anish and Wang, Ben and Wang, Kevin and Zou, Andy},
title = {A framework for few-shot language model evaluation},
month = 12,
year = 2023,
publisher = {Zenodo},
version = {v0.4.0},
doi = {10.5281/zenodo.10256836},
url = {https://zenodo.org/records/10256836}
}
GitHub Events
Total
- Watch event: 3
- Push event: 2
- Pull request event: 4
- Fork event: 1
Last Year
- Watch event: 3
- Push event: 2
- Pull request event: 4
- Fork event: 1
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| lintangsutawika | l****g@s****m | 686 |
| haileyschoelkopf | h****y@e****i | 404 |
| Leo Gao | l****1@g****m | 341 |
| baberabb | 9****b | 193 |
| Jonathan Tow | j****1@g****m | 145 |
| Benjamin Fattori | b****i@m****a | 80 |
| Stella Biderman | s****n@g****m | 73 |
| haileyschoelkopf | h****f@y****u | 49 |
| jon-tow | j****w@J****l | 39 |
| Jason Phang | e****l@j****m | 30 |
| FarzanehNakhaee | f****0@g****m | 25 |
| cjlovering | c****g@w****u | 25 |
| Charles Foster | c****0@c****u | 23 |
| Anish Thite | a****e@g****m | 20 |
| Muennighoff | 6****f | 18 |
| Aflah | 7****2 | 17 |
| thomasw21 | 2****1 | 17 |
| Stephen Hogg | s****g@e****u | 17 |
| thefazzer | p****n@g****m | 15 |
| sdtblck | 4****k | 15 |
| researcher2 | 3****2 | 14 |
| nikuya3 | 5****3 | 14 |
| Julen Etxaniz | j****a@g****m | 14 |
| Chris | c****s@a****l | 14 |
| Tian Yun | t****n@h****m | 13 |
| gk | g****n@g****m | 12 |
| h-albert-lee | g****4@g****m | 12 |
| & | | 12 |
| LSinev | L****v | 11 |
| Zdenek Kasner | k****e@f****z | 10 |
| and 184 more... | ||
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 0
- Total pull requests: 3
- Average time to close issues: N/A
- Average time to close pull requests: about 3 hours
- Total issue authors: 0
- Total pull request authors: 2
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 3
- Average time to close issues: N/A
- Average time to close pull requests: about 3 hours
- Issue authors: 0
- Pull request authors: 2
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- OscarMoliina (3)
- jguaschmarti (2)
Dependencies
- actions/checkout v4 composite
- actions/download-artifact v3 composite
- actions/setup-python v4 composite
- actions/upload-artifact v3 composite
- pypa/gh-action-pypi-publish release/v1 composite
- actions/checkout v4 composite
- actions/setup-python v5 composite
- actions/upload-artifact v3 composite
- pre-commit/action v3.0.1 composite
- accelerate >=0.26.0
- datasets >=2.16.0
- dill *
- evaluate >=0.4.0
- evaluate *
- jsonlines *
- more_itertools *
- numexpr *
- peft >=0.2.0
- pybind11 >=2.6.2
- pytablewriter *
- rouge-score >=0.0.4
- sacrebleu >=1.5.0
- scikit-learn >=0.24.1
- sqlitedict *
- torch >=1.8
- tqdm-multiprocess *
- transformers >=4.1
- word2number *
- zstandard *
- actions/checkout v3 composite
- actions/setup-python v4 composite
- tj-actions/changed-files v44.5.2 composite