lm-evaluation-harness-medical-specialities

https://github.com/hpai-bsc/lm-evaluation-harness-medical-specialities

Science Score: 77.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 3 DOI reference(s) in README
✓ Academic publication links: Links to zenodo.org
✓ Committers with academic emails: 16 of 214 committers (7.5%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (4.2%) to scientific vocabulary

Keywords from Contributors

transformation cryptocurrencies jax cryptography agents rlhf large-language-models foundation-models llamaindex multi-agents

Last synced: 11 months ago

Repository

Basic Info

Host: GitHub
Owner: HPAI-BSC
License: mit
Language: Python
Default Branch: main
Size: 22.6 MB

Statistics

Stars: 3
Watchers: 2
Forks: 2
Open Issues: 0
Releases: 0

Created almost 2 years ago · Last pushed over 1 year ago

Metadata Files

Readme Contributing License Citation Codeowners

Language Model Evaluation Harness with Medical Specialities Classification

A fork of the official lm-evaluation-harness repo that adds the medical_specialities task, in which questions are classified into their medical specialities.

How to use it

Follow the official lm-evaluation-harness guide, using the task medical_specialities:

    lm_eval --model hf \
        --model_args pretrained=EleutherAI/pythia-160m \
        --tasks medical_specialities \
        --device cuda:0 \
        --batch_size 8
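The same run can also be started from Python. The sketch below assumes that this fork keeps upstream's `lm_eval.simple_evaluate` entry point and that the package is installed from this repository; `build_eval_kwargs` and `run_medical_specialities` are hypothetical helper names introduced only for illustration.

```python
def build_eval_kwargs(pretrained="EleutherAI/pythia-160m", device="cuda:0", batch_size=8):
    """Mirror the CLI flags of the command above as keyword arguments (hypothetical helper)."""
    return {
        "model": "hf",
        "model_args": f"pretrained={pretrained}",
        "tasks": ["medical_specialities"],
        "device": device,
        "batch_size": batch_size,
    }


def run_medical_specialities(**overrides):
    """Run the task through the harness; needs the fork installed and downloads the model."""
    # Imported lazily so build_eval_kwargs can be used without the package installed.
    import lm_eval

    return lm_eval.simple_evaluate(**build_eval_kwargs(**overrides))
```

Calling `run_medical_specialities(device="cpu")` would, under these assumptions, run the same evaluation without a GPU, only more slowly.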

Running this task reports the metrics per speciality, which can help you identify whether a category is underrepresented or whether other biases are present.

|Tasks|Version|Filter|n-shot|Metric| |Value| |Stderr|
|-------------------|------:|------|-----:|------|---|-----:|---|-----:|
|Allergy|0|none|0|acc|↑|0.3200|±|0.0469|
|Anatomy|0|none|0|acc|↑|0.2862|±|0.0186|
|Anesthesiology|0|none|0|acc|↑|0.2577|±|0.0344|
|Biochemistry|0|none|0|acc|↑|0.2388|±|0.0104|
|Cardiology|0|none|0|acc|↑|0.2659|±|0.0211|
|Chemistry|0|none|0|acc|↑|0.2587|±|0.0193|
|Dermatology|0|none|0|acc|↑|0.2660|±|0.0323|
|Emergency|0|none|0|acc|↑|0.2871|±|0.0319|
|Endocrinology|0|none|0|acc|↑|0.2456|±|0.0216|
|Gastroenterology|0|none|0|acc|↑|0.2364|±|0.0207|
|Genetics|0|none|0|acc|↑|0.2776|±|0.0192|
|Geriatrics|0|none|0|acc|↑|0.2609|±|0.0532|
|Gynecology|0|none|0|acc|↑|0.3015|±|0.0395|
|Hematology|0|none|0|acc|↑|0.2220|±|0.0184|
|Microbiology|0|none|0|acc|↑|0.2576|±|0.0141|
|Nephrology|0|none|0|acc|↑|0.2747|±|0.0271|
|Neurology|0|none|0|acc|↑|0.2801|±|0.0210|
|Nursing|0|none|0|acc|↑|0.2374|±|0.0303|
|Obstetrics|0|none|0|acc|↑|0.2655|±|0.0235|
|Odontology|0|none|0|acc|↑|0.3337|±|0.0149|
|Oncology|0|none|0|acc|↑|0.2367|±|0.0272|
|Ophthalmology|0|none|0|acc|↑|0.2500|±|0.0367|
|Orthopedics|0|none|0|acc|↑|0.3180|±|0.0317|
|Otorhinolaryngology|0|none|0|acc|↑|0.2775|±|0.0310|
|Pathology|0|none|0|acc|↑|0.2680|±|0.0452|
|Pediatrics|0|none|0|acc|↑|0.2959|±|0.0267|
|Pharmacology|0|none|0|acc|↑|0.2772|±|0.0158|
|Physiology|0|none|0|acc|↑|0.2559|±|0.0254|
|Psychiatry|0|none|0|acc|↑|0.2601|±|0.0143|
|Psychology|0|none|0|acc|↑|0.2686|±|0.0202|
|Radiology|0|none|0|acc|↑|0.3371|±|0.0504|
|Respiratory|0|none|0|acc|↑|0.2600|±|0.0235|
|Rheumatology|0|none|0|acc|↑|0.2110|±|0.0393|
|Surgery|0|none|0|acc|↑|0.2697|±|0.0334|
|Urology|0|none|0|acc|↑|0.2727|±|0.0427|
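One way to spot underrepresented categories in these results is to read the standard error as a proxy for sample size: a larger stderr generally means fewer questions. The sketch below uses a handful of rows from the results above and assumes the reported stderr is the sample standard deviation of per-question correctness divided by the square root of the question count, so that stderr² ≈ acc · (1 − acc) / (n − 1). That formula is an assumption about how the harness computes the value, and `estimate_question_count` is a hypothetical helper; the counts are rough estimates, not dataset statistics.

```python
# (accuracy, stderr) pairs copied from a subset of the reported results.
reported = {
    "Allergy": (0.3200, 0.0469),
    "Biochemistry": (0.2388, 0.0104),
    "Geriatrics": (0.2609, 0.0532),
    "Odontology": (0.3337, 0.0149),
    "Radiology": (0.3371, 0.0504),
}


def estimate_question_count(acc, stderr):
    """Invert stderr^2 = acc * (1 - acc) / (n - 1) to approximate n (assumed formula)."""
    return round(acc * (1.0 - acc) / stderr ** 2) + 1


estimates = {
    name: estimate_question_count(acc, stderr)
    for name, (acc, stderr) in reported.items()
}

# Print the smallest categories first, since those are the likeliest to be underrepresented.
for name, count in sorted(estimates.items(), key=lambda item: item[1]):
    print(f"{name:<14} ~{count} questions")
```

Under this assumption, Geriatrics and Radiology come out at well under a hundred questions each, while Biochemistry comes out at well over a thousand, so accuracy differences between the small categories should be read with caution.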

More info about the datasets: https://huggingface.co/datasets/HPAI-BSC/medical-specialities
More info about the code to classify the questions: https://github.com/HPAI-BSC/medical-specialities
Notebook with usage example: link

Owner

Name: HPAI-BSC
Login: HPAI-BSC
Kind: organization
Email: hpai@bsc.es
Location: Barcelona

Website: hpai.bsc.es
Twitter: hpai_bsc
Repositories: 18
Profile: https://github.com/HPAI-BSC

Citation (CITATION.bib)

@misc{eval-harness,
  author       = {Gao, Leo and Tow, Jonathan and Abbasi, Baber and Biderman, Stella and Black, Sid and DiPofi, Anthony and Foster, Charles and Golding, Laurence and Hsu, Jeffrey and Le Noac'h, Alain and Li, Haonan and McDonell, Kyle and Muennighoff, Niklas and Ociepa, Chris and Phang, Jason and Reynolds, Laria and Schoelkopf, Hailey and Skowron, Aviya and Sutawika, Lintang and Tang, Eric and Thite, Anish and Wang, Ben and Wang, Kevin and Zou, Andy},
  title        = {A framework for few-shot language model evaluation},
  month        = 12,
  year         = 2023,
  publisher    = {Zenodo},
  version      = {v0.4.0},
  doi          = {10.5281/zenodo.10256836},
  url          = {https://zenodo.org/records/10256836}
}

GitHub Events

Total

Watch event: 3
Push event: 2
Pull request event: 4
Fork event: 1

Last Year

Watch event: 3
Push event: 2
Pull request event: 4
Fork event: 1

Committers

Last synced: about 1 year ago

All Time

Total Commits: 2,738
Total Committers: 214
Avg Commits per committer: 12.794
Development Distribution Score (DDS): 0.749

Past Year

Commits: 107
Committers: 51
Avg Commits per committer: 2.098
Development Distribution Score (DDS): 0.776

Top Committers

Name	Email	Commits
lintangsutawika	l**g@s**m	686
haileyschoelkopf	h**y@e**i	404
Leo Gao	l**1@g**m	341
baberabb	9****b	193
Jonathan Tow	j**1@g**m	145
Benjamin Fattori	b**i@m**a	80
Stella Biderman	s**n@g**m	73
haileyschoelkopf	h**f@y**u	49
jon-tow	j**w@J**l	39
Jason Phang	e**l@j**m	30
FarzanehNakhaee	f**0@g**m	25
cjlovering	c**g@w**u	25
Charles Foster	c**0@c**u	23
Anish Thite	a**e@g**m	20
Muennighoff	6****f	18
Aflah	7****2	17
thomasw21	2****1	17
Stephen Hogg	s**g@e**u	17
thefazzer	p**n@g**m	15
sdtblck	4****k	15
researcher2	3****2	14
nikuya3	5****3	14
Julen Etxaniz	j**a@g**m	14
Chris	c**s@a**l	14
Tian Yun	t**n@h**m	13
gk	g**n@g**m	12
h-albert-lee	g**4@g**m	12
&		12
LSinev	L****v	11
Zdenek Kasner	k**e@f**z	10
and 184 more...

Committer Domains (Top 20 + Academic)

bsc.es: 3
nicholaskross.com: 1
posteo.net: 1
rogers.com: 1
alum.mit.edu: 1
mail.ustc.edu.cn: 1
kevinwang.us: 1
a13x.io: 1
imperial.ac.uk: 1
neuralmagic.com: 1
berkeley.edu: 1
brown.edu: 1
illuin.tech: 1
allenai.org: 1
ethanhs.me: 1
shopee.com: 1
mozilla.ai: 1
umd.edu: 1
fel.cvut.cz: 1
azurro.pl: 1
ccrma.stanford.edu: 1
wpi.edu: 1
yale.edu: 1
mail.utoronto.ca: 1
mit.edu: 1
brandeis.edu: 1
nyu.edu: 1
login01.expanse.sdsc.edu: 1
dfki.de: 1
is68.ifis.cs.tu-bs.de: 1

Issues and Pull Requests

Last synced: about 1 year ago

All Time

Total issues: 0
Total pull requests: 3
Average time to close issues: N/A
Average time to close pull requests: about 3 hours
Total issue authors: 0
Total pull request authors: 2
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 3
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 3
Average time to close issues: N/A
Average time to close pull requests: about 3 hours
Issue authors: 0
Pull request authors: 2
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 3
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

Pull Request Authors

OscarMoliina (3)
jguaschmarti (2)

Top Labels

Issue Labels

Pull Request Labels

Dependencies

.github/workflows/publish.yml actions

actions/checkout v4 composite
actions/download-artifact v3 composite
actions/setup-python v4 composite
actions/upload-artifact v3 composite
pypa/gh-action-pypi-publish release/v1 composite

.github/workflows/unit_tests.yml actions

actions/checkout v4 composite
actions/setup-python v5 composite
actions/upload-artifact v3 composite
pre-commit/action v3.0.1 composite

pyproject.toml pypi

accelerate >=0.26.0
datasets >=2.16.0
dill *
evaluate >=0.4.0
evaluate *
jsonlines *
more_itertools *
numexpr *
peft >=0.2.0
pybind11 >=2.6.2
pytablewriter *
rouge-score >=0.0.4
sacrebleu >=1.5.0
scikit-learn >=0.24.1
sqlitedict *
torch >=1.8
tqdm-multiprocess *
transformers >=4.1
word2number *
zstandard *

requirements.txt pypi

setup.py pypi

.github/workflows/new_tasks.yml actions

actions/checkout v3 composite
actions/setup-python v4 composite
tj-actions/changed-files v44.5.2 composite
