Releases | Open Source Science

py-alpaca-eval - Release v0.6.6

What's Changed

[ENH] add strict decoding OAI by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/394
Add blendaxai-gm-l6-vo31 to AlpacaEval by @ym-blendax-ai in https://github.com/tatsu-lab/alpaca_eval/pull/399
Added Llama3-PBM-Nova-70B model by @PKU-Baichuan in https://github.com/tatsu-lab/alpaca_eval/pull/395
Add evaluator weighted_alpaca_eval_gpt-4o-mini-2024-07-18 by @tongyx361 in https://github.com/tatsu-lab/alpaca_eval/pull/401
Add Shopee-SlimMoA-v1 to AlpacaEval by @LLM-Alignment-sh in https://github.com/tatsu-lab/alpaca_eval/pull/398
[ENH] add metadata to completion: date, version,... by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/402
Add REBEL-Llama-3-8B-Instruct-Armo to AlpacaEval by @ZhaolinGao in https://github.com/tatsu-lab/alpaca_eval/pull/403
Add Llama-3-8B-Instruct-SkillMix to AlpacaEval by @parksimon0808 in https://github.com/tatsu-lab/alpaca_eval/pull/405
Updated HF Link in model_configs for Llama-3-8B-Instruct-SkillMix by @parksimon0808 in https://github.com/tatsu-lab/alpaca_eval/pull/409
Add SelfMoA_gemma-2-9b-it-SimPO, SelfMoA_gemma-2-9b-it-WPO-HB to AlpacaEval by @wenzhe-li in https://github.com/tatsu-lab/alpaca_eval/pull/411
add Self-taught-llama3.1-70B-dpo as a evaluator by @tianlu-wang in https://github.com/tatsu-lab/alpaca_eval/pull/412
Add GPO-Llama-3-8B-Instruct-GPM-2B and SPPO-Llama-3-8B-Instruct-GPM-2… by @xukp20 in https://github.com/tatsu-lab/alpaca_eval/pull/413
Add NullModel to AlpacaEval by @xszheng2020 in https://github.com/tatsu-lab/alpaca_eval/pull/414
Add Llama-3-Instruct-8B-RainbowPO to AlpacaEval by @hanyang1999 in https://github.com/tatsu-lab/alpaca_eval/pull/416
add example for Llama3 vllm server by @cameron-chen in https://github.com/tatsu-lab/alpaca_eval/pull/404
Add FuseChat-3.0 models to AlpacaEval by @yangzy39 in https://github.com/tatsu-lab/alpaca_eval/pull/426
Add TOA to AlpacaEval by @oceanypt in https://github.com/tatsu-lab/alpaca_eval/pull/428
[BUG] tool_calls by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/429

New Contributors

@PKU-Baichuan made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/395
@LLM-Alignment-sh made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/398
@parksimon0808 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/405
@wenzhe-li made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/411
@tianlu-wang made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/412
@xukp20 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/413
@xszheng2020 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/414
@hanyang1999 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/416
@cameron-chen made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/404
@yangzy39 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/426
@oceanypt made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/428

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6.5...v0.6.6

- Jupyter Notebook
Published by github-actions[bot] over 1 year ago

py-alpaca-eval - Release v0.6.5

What's Changed

Add Llama-3-Instruct-8B-WPO-HB-v2 to AlpacaEval by @wzhouad in https://github.com/tatsu-lab/alpaca_eval/pull/377
[ENH] add llama 3.1 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/378
[ENH] add example for LLama 3 vllm by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/381
Add Infinity-Instruct-7M-0729-Llama31-70B, Infinity-Instruct-7M-0729-Llama31-8B, Infinity-Instruct-7M-0729-mistral-7B to AlpacaEval by @cszhengyh in https://github.com/tatsu-lab/alpaca_eval/pull/383
Add gemma-2-9b-it-WPO-HB to AlpacaEval by @wzhouad in https://github.com/tatsu-lab/alpaca_eval/pull/384
Add link to gemma-2-9b-it-WPO-HB by @wzhouad in https://github.com/tatsu-lab/alpaca_eval/pull/385
Change the name of the Infinity-Instruct-7M-0729-Models to Infinity-Instruct-7M-Gen-Models by @cszhengyh in https://github.com/tatsu-lab/alpaca_eval/pull/387
Add blendaxai-gm-l3-v35 to AlpacaEval by @ym-blendax-ai in https://github.com/tatsu-lab/alpaca_eval/pull/389
[ENH] OpenAI use tools instead of functions by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/391
[ENH] enable base_dir to be a list by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/392
[ENH] add mistral v0.3, Qwen2 70b, gtp4 mini by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/393

New Contributors

@wzhouad made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/377
@ym-blendax-ai made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/389

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6.4...v0.6.5

Published by github-actions[bot] almost 2 years ago

py-alpaca-eval - Release v0.6.4

What's Changed

Add SPPO-Llama-3-Instruct-8B-PairRM to AlpacaEval by @Edward-Sun in https://github.com/tatsu-lab/alpaca_eval/pull/354
Add Infinity-Instruct-3M-0613-Llama3-70B to AlpacaEval by @cszhengyh in https://github.com/tatsu-lab/alpaca_eval/pull/358
Add SPPO-Gemma-2-9B-It-PairRM to AlpacaEval by @angelahzyuan in https://github.com/tatsu-lab/alpaca_eval/pull/359
Add Infinity-Instruct-3M-0625-Models to AlpacaEval by @cszhengyh in https://github.com/tatsu-lab/alpaca_eval/pull/364
Add Higgs Llama3-70B V2 Results by @sxjscience in https://github.com/tatsu-lab/alpaca_eval/pull/367
Added Ghost 8B Beta (d0x5) model by @lh0x00 in https://github.com/tatsu-lab/alpaca_eval/pull/366
Add gemma-2-9b-it-SimPO and gemma-2-9b-it-DPO to AlpacaEval by @xiamengzhou in https://github.com/tatsu-lab/alpaca_eval/pull/368
[ENH] add CI test for unwanted files by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/369
update model links by @xiamengzhou in https://github.com/tatsu-lab/alpaca_eval/pull/370
[ENH] add the code to compute instruction_following by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/371
[ENH] adding simplified glm by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/372
[BUG] backward compatibility vllm do_sample -> use_beam_search by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/373

New Contributors

@angelahzyuan made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/359
@sxjscience made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/367

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6.3...v0.6.4

Published by github-actions[bot] almost 2 years ago

py-alpaca-eval - Release v0.6.3

What's Changed

Add the evaluation result for our latest model by @hendrydong in https://github.com/tatsu-lab/alpaca_eval/pull/286
Add Ghost 7B Alpha to AlpacaEval by @lh0x00 in https://github.com/tatsu-lab/alpaca_eval/pull/288
Add link for FsfairX-Zephyr-Chat-v0.1 by @hendrydong in https://github.com/tatsu-lab/alpaca_eval/pull/289
add Qwen1.5-110B-Chat self-report results by @Lukeming-tsinghua in https://github.com/tatsu-lab/alpaca_eval/pull/291
[ENH] verifying all the qwens by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/292
Enable analyzing evaluators/annotators on data without multiple generator models by @rdnfn in https://github.com/tatsu-lab/alpaca_eval/pull/293
Add Storm-7B to AlpacaEval by @yifan123 in https://github.com/tatsu-lab/alpaca_eval/pull/294
Use verified by default by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/297
Add SPPO-Mistral7B-PairRM to AlpacaEval by @Edward-Sun in https://github.com/tatsu-lab/alpaca_eval/pull/298
Add ExPO results to AlpacaEval by @chujiezheng in https://github.com/tatsu-lab/alpaca_eval/pull/299
Fix typo in README.md by @tongyx361 in https://github.com/tatsu-lab/alpaca_eval/pull/302
Add Yi-Large Preview to AlpacaEval by @HyperdriveHustle in https://github.com/tatsu-lab/alpaca_eval/pull/304
"Add Mistral-7B+RAHF-DUAL+LoRA to AlpacaEval" by @LiuAmber in https://github.com/tatsu-lab/alpaca_eval/pull/307
[verified] Yi-large by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/309
[ADD] GPT4-o by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/311
[ENH] add LC SEM by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/317
llama3 evaluator by @zhuang-li in https://github.com/tatsu-lab/alpaca_eval/pull/314
Update README.md by @zhuang-li in https://github.com/tatsu-lab/alpaca_eval/pull/315
[CLEAN] move evaluators lb llama3 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/318
[ENH] vicuna 1.5 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/319
Add Llama-3-Instruct-8B-SimPO to AlpacaEval by @xiamengzhou in https://github.com/tatsu-lab/alpaca_eval/pull/320
[ENH] Use multi threading instead of processing by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/321
Add Aligner 2B+GPT-4 Turbo (04/09) Results by @AlignInc in https://github.com/tatsu-lab/alpaca_eval/pull/324
Add REBEL-Llama-3-8B-Instruct to AlpacaEval by @ZhaolinGao in https://github.com/tatsu-lab/alpaca_eval/pull/326
[ENH&BUG] improve VLLM by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/330
Add ExPO + Llama-3-Instruct-8B-SimPO results by @chujiezheng in https://github.com/tatsu-lab/alpaca_eval/pull/331
fix model link by @chujiezheng in https://github.com/tatsu-lab/alpaca_eval/pull/332
Add merlinite-7B-AOT to AlpacaEval by @imelnyk in https://github.com/tatsu-lab/alpaca_eval/pull/334
[BUG] fix bs in VLLM and add chatml by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/338
Add Together-MoA, Together-MoA-Lite to AlpacaEval by @IsThatYou in https://github.com/tatsu-lab/alpaca_eval/pull/342
Add Nanbeige2-16B-Chat to AlpacaEval by @yuani114 in https://github.com/tatsu-lab/alpaca_eval/pull/345
Add claude-3-5-sonnet-20240620 to AlpacaEval by @MarjovanLier in https://github.com/tatsu-lab/alpaca_eval/pull/348
[BUG] trust repo alpaca_eval by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/349
Add OpenPipe Mixture of Agents model to Alpaca Eval by @saum7800 in https://github.com/tatsu-lab/alpaca_eval/pull/347
Add Storm-7B, Storm-7B (best-of-64) to AlpacaEval by @yifan123 in https://github.com/tatsu-lab/alpaca_eval/pull/344
Add Infinity-Instruct-3M-0613-Mistral-7B to AlpacaEval by @cszhengyh in https://github.com/tatsu-lab/alpaca_eval/pull/351

New Contributors

@hendrydong made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/286
@lh0x00 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/288
@yifan123 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/294
@Edward-Sun made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/298
@chujiezheng made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/299
@tongyx361 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/302
@LiuAmber made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/307
@zhuang-li made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/314
@xiamengzhou made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/320
@ZhaolinGao made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/326
@imelnyk made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/334
@IsThatYou made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/342
@MarjovanLier made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/348
@saum7800 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/347
@cszhengyh made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/351

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6.2...v0.6.3

Published by github-actions[bot] almost 2 years ago

py-alpaca-eval - Release v0.6.2

What's Changed

[BUG] backward compatibility with AF by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/278
Add Nanbeige-Plus-Chat-v0.1 to AlpacaEval by @yuani114 in https://github.com/tatsu-lab/alpaca_eval/pull/279
Update README.md by @Dominic789654 in https://github.com/tatsu-lab/alpaca_eval/pull/280
[BUG] revert to GPT4 preview 1106 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/283
Add support for analyzing evaluators with custom cross-annotations by @rdnfn in https://github.com/tatsu-lab/alpaca_eval/pull/281
[ENH] llama3 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/285

New Contributors

@Dominic789654 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/280
@rdnfn made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/281

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6.1...v0.6.2

Published by github-actions[bot] about 2 years ago

py-alpaca-eval - Release v0.6.1

What's Changed

Add Aligner-2B+Qwen1.5-72B-Chat & Aligner-2B+Claude3 Opus to AlpacaEval by @AlignInc in https://github.com/tatsu-lab/alpaca_eval/pull/259
Supplement for Aligner by @AlignInc in https://github.com/tatsu-lab/alpaca_eval/pull/261
Add Ein-70B-v0.1 to AlpacaEval by @bin-bi in https://github.com/tatsu-lab/alpaca_eval/pull/262
Add TempNet-LLaMA2-Chat to AlpacaEval by @xumao-nju in https://github.com/tatsu-lab/alpaca_eval/pull/264
Add Conifer-7B-DPO to AlpacaEval by @liulixin29 in https://github.com/tatsu-lab/alpaca_eval/pull/267
Updating link to a super fast demo! by @kyleliang919 in https://github.com/tatsu-lab/alpaca_eval/pull/268
Add Nanbeige2-8B-Chat to AlpacaEval by @yuani114 in https://github.com/tatsu-lab/alpaca_eval/pull/274
[ENH] adding drbx and gpt4 turbo by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/275

New Contributors

@AlignInc made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/259
@bin-bi made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/262
@xumao-nju made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/264
@liulixin29 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/267
@yuani114 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/274

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6...v0.6.1

Published by github-actions[bot] about 2 years ago

py-alpaca-eval - Release v0.6

What's Changed

[DATA] Add Gemma by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/242
[NOTEBOOK] adding final length correction notebook. by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/244
add Mistral-7B-ReMax-v0.1 by @liziniu in https://github.com/tatsu-lab/alpaca_eval/pull/245
[ENH] add claude 3 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/247
[ENH] add contextual by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/250
[ENH] add mistral large by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/251
Add Samba-CoE-v0.2 to AlpacaEval by @kyleliang919 in https://github.com/tatsu-lab/alpaca_eval/pull/253
Add Samba-CoE-v0.2-best-of-16 to AlpacaEval by @kyleliang919 in https://github.com/tatsu-lab/alpaca_eval/pull/256
Add Mistral-ORPO-Beta to AlpacaEval by @jiwooya1000 in https://github.com/tatsu-lab/alpaca_eval/pull/257
Yann/length correction by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/258

New Contributors

@liziniu made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/245
@kyleliang919 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/253
@jiwooya1000 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/257

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.4...v0.6

Published by github-actions[bot] about 2 years ago

py-alpaca-eval - Release v0.5.4

What's Changed

Add Qwen1.5-72B-Chat to AlpacaEval by @Lukeming-tsinghua in https://github.com/tatsu-lab/alpaca_eval/pull/226
Add claude-instant-1.2, deepseek-llm-67b-chat, wizardlm-70b, Qwen-14B-Chat (config + outputs without annotations) by @gblazex in https://github.com/tatsu-lab/alpaca_eval/pull/228
[DATA] Adding annotations for the arena models by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/229
Update README.md - Add missing "Y" to "ou" by @yoderj in https://github.com/tatsu-lab/alpaca_eval/pull/230
[DEV] Analyzing length-controlled metrics. by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/231
[DOC] add annotation interpretation by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/232
[DATA] add results from the Arena openai models by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/234
update ELO for llama-2-13b-chat-hf by @gblazex in https://github.com/tatsu-lab/alpaca_eval/pull/235
[NOTEBOOK] add length-corrected GLM by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/237
[ENH] add inverse mapper to make sure in and out types are the same by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/240
[ENH] update to allow AF to use AE by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/241

New Contributors

@Lukeming-tsinghua made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/226
@yoderj made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/230

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.3...v0.5.4

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.5.3

What's Changed

[ENH] add mistral-medium by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/205
[ENH] add internlm2-chat-20b-ppo by @C1rN09 in https://github.com/tatsu-lab/alpaca_eval/pull/207
prettify "pretty_name" of internlm2 by @C1rN09 in https://github.com/tatsu-lab/alpaca_eval/pull/208
[ENH] add outputs & configs form dolphin 2.2.1 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/209
Add PairRM 0.4B + Yi-34B-Chat to AlpacaEval 2.0 by @jdf-prog in https://github.com/tatsu-lab/alpaca_eval/pull/210
dolphin 2.1.1 configs.yaml by @gblazex in https://github.com/tatsu-lab/alpaca_eval/pull/212
Update README.md (small typo) by @xwinxu in https://github.com/tatsu-lab/alpaca_eval/pull/213
[TEST]: fix ordering of df by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/214
Add Snorkel-Mistral-PairRM-DPO (best-of-16) to Alpaca Eval 2.0 by @viethoangtranduong in https://github.com/tatsu-lab/alpaca_eval/pull/215
update InternLM2 chat template by @C1rN09 in https://github.com/tatsu-lab/alpaca_eval/pull/216
Add Starling-LM-7B-alpha, vicuna-13b-v1.5, vicuna-7b-v1.5 to AlpacaEval (config + outputs without annotations) by @gblazex in https://github.com/tatsu-lab/alpaca_eval/pull/217
[RES] add 3 models for arena correlations by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/218
Add xwinlm-70b-v0.3 to AlpacaEval by @nbl97 in https://github.com/tatsu-lab/alpaca_eval/pull/221
[ENH] add referenced_models locally by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/224

New Contributors

@C1rN09 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/207
@gblazex made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/212
@xwinxu made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/213
@viethoangtranduong made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/215

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.2...v0.5.3

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.5.2

What's Changed

[BUG] force openai >1.5.0 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/202
[WIP] precompute all leaderboard for AE2 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/199
[ENH] add OpenHermes by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/203

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.1...v0.5.2

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.5.1

What's Changed

[BUG] fix no OAI org id set by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/200

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.0...v0.5.1

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.5.0

What's Changed

Fix mssg check by @Muennighoff in https://github.com/tatsu-lab/alpaca_eval/pull/174
Add MiniChat-1.5-3B to AlpacaEval and Fix MiniChat-3B by @GeneZC in https://github.com/tatsu-lab/alpaca_eval/pull/176
Add 01-ai/Yi-34B-Chat to AlpacaEval by @HyperdriveHustle in https://github.com/tatsu-lab/alpaca_eval/pull/175
feat: add way to verify results by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/177
show img in readme by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/178
Add PairRM best-of-16 to AlpacaEval by @jdf-prog in https://github.com/tatsu-lab/alpaca_eval/pull/181
Verify Yi by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/182
chore: add phi-2 sft by @lxuechen in https://github.com/tatsu-lab/alpaca_eval/pull/184
add cut-13b by @wwxu21 in https://github.com/tatsu-lab/alpaca_eval/pull/186
chore: add phi-2 dpo by @lxuechen in https://github.com/tatsu-lab/alpaca_eval/pull/185
Support phi2, Support SOLAR 10.7B LMCocktail by @yhyu13 in https://github.com/tatsu-lab/alpaca_eval/pull/183
Update openai.py by @Muennighoff in https://github.com/tatsu-lab/alpaca_eval/pull/188
chore: add link for phi-2-sft by @lxuechen in https://github.com/tatsu-lab/alpaca_eval/pull/190
chore: fix links by @lxuechen in https://github.com/tatsu-lab/alpaca_eval/pull/191
Add deita-7b-v1.0 model by @VPeterV in https://github.com/tatsu-lab/alpaca_eval/pull/192
[ENH] Azure OAI client & more general way of switching between client configs by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/193
[ENH] Weighted win rates by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/189
[ENH] new models: Gemini / claude2.1 / mistral / mixtral / .. by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/195
[ENH] alpaca_eval 2.0 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/196

New Contributors

@Muennighoff made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/174
@HyperdriveHustle made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/175
@jdf-prog made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/181
@lxuechen made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/184
@wwxu21 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/186
@yhyu13 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/183
@VPeterV made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/192

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.6...v0.5.0

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.3.6

What's Changed

feat: verify all the cohere model & use it as eval by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/170
Add Tulu 2 models to AlpacaEval by @hamishivi in https://github.com/tatsu-lab/alpaca_eval/pull/171

New Contributors

@hamishivi made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/171

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.5...v0.3.6

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.3.5

What's Changed

[WIP] GPT4 turbo as evaluator by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/160
[ENH] add GPT4 turbo as evaluator in README by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/165
Add minichat-3b to AlpacaEval by @GeneZC in https://github.com/tatsu-lab/alpaca_eval/pull/167
fix: filter openai spam filter by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/169

New Contributors

@GeneZC made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/167

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.3...v0.3.5

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release vv0.3.4

What's Changed

[WIP] GPT4 turbo as evaluator by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/160
[ENH] add GPT4 turbo as evaluator in README by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/165
Add minichat-3b to AlpacaEval by @GeneZC in https://github.com/tatsu-lab/alpaca_eval/pull/167
fix: filter openai spam filter by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/169

New Contributors

@GeneZC made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/167

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.3...vv0.3.4

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.3.3

What's Changed

Gpt4 turbo by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/159

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.2...v0.3.3

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.3.2

What's Changed

add UltraLM-13b-V2.0/UltraLM-13b-V2.0-best-of-16/UltraLM-13b-best-of-16 to AlpacaEval by @lifan-yuan in https://github.com/tatsu-lab/alpaca_eval/pull/139
Add annotations & fix leaderboard by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/142
refresh Cohere by @sanderland in https://github.com/tatsu-lab/alpaca_eval/pull/141
Add PlatoLM-7B to AlpacaEval by @renatz in https://github.com/tatsu-lab/alpaca_eval/pull/143
Add evo-7b to AlpacaEval by @zfang in https://github.com/tatsu-lab/alpaca_eval/pull/144
Add NEFTune models to AlpacaEval by @neelsjain in https://github.com/tatsu-lab/alpaca_eval/pull/146
Add claude2-alpaca-13b, recycled-wizardlm-7b-v1.0, recycled-wizardlm-… by @MingLiiii in https://github.com/tatsu-lab/alpaca_eval/pull/147
Add CausalLM/14B to AlpacaEval by @CausalLM in https://github.com/tatsu-lab/alpaca_eval/pull/148
Add Zephyr 7B evals by @lewtun in https://github.com/tatsu-lab/alpaca_eval/pull/152
Add Evo v2 7B by @zfang in https://github.com/tatsu-lab/alpaca_eval/pull/153
Add decoder for calling Anthropic models via Amazon Bedrock by @billcai in https://github.com/tatsu-lab/alpaca_eval/pull/151
cohere update by @sanderland in https://github.com/tatsu-lab/alpaca_eval/pull/155
feat: upgrade to openai 1.0.0 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/157

New Contributors

@lifan-yuan made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/139
@renatz made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/143
@zfang made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/144
@neelsjain made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/146
@MingLiiii made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/147
@CausalLM made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/148
@lewtun made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/152
@billcai made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/151

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.1...v0.3.2

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.3.1

What's Changed

Add results of Xwin-LM by @nbl97 in https://github.com/tatsu-lab/alpaca_eval/pull/135
[ENH] add gpt 3.5 instruct by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/137

New Contributors

@nbl97 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/135

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.0...v0.3.1

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.3.0

What's Changed

[ENH] add fixed gpt4 version annotator by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/127
Add openbuddy-llama2-13b-v11.1 by @44670 in https://github.com/tatsu-lab/alpaca_eval/pull/129
[ENH] add max concurrency oai by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/131

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.9...v0.3.0

Published by github-actions[bot] almost 3 years ago

py-alpaca-eval - Release v0.2.9

What's Changed

Ensure primary keys are string & decrease processes for OpenAI by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/116
Add JinaChat to the leaderboards by @jupyterjazz in https://github.com/tatsu-lab/alpaca_eval/pull/117
[BUG] jina chat error in configs by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/118
Add Humpback to AlpacaEval by @xianxl in https://github.com/tatsu-lab/alpaca_eval/pull/120
update Humpback results by @xianxl in https://github.com/tatsu-lab/alpaca_eval/pull/121
add link to Humpback paper by @xianxl in https://github.com/tatsu-lab/alpaca_eval/pull/122
Add vllm decoder for model inference by @44670 in https://github.com/tatsu-lab/alpaca_eval/pull/124
[ENH] return completions_all and allow sequence of max_tokens by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/125

New Contributors

@jupyterjazz made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/117
@xianxl made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/120

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.8...v0.2.9

Published by github-actions[bot] almost 3 years ago

py-alpaca-eval - Release v0.2.8

What's Changed

[BUG] closes #77 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/109
Add openbuddy-llama-30b-v7.1 to AlpacaEval by @44670 in https://github.com/tatsu-lab/alpaca_eval/pull/108
Fix typo on pretty_name by @44670 in https://github.com/tatsu-lab/alpaca_eval/pull/110
Add openbuddy-falcon-40b-v9 to AlpacaEval by @44670 in https://github.com/tatsu-lab/alpaca_eval/pull/111
[CLEAN] remove warning by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/112
[BUG] utils.DUMMY_EXAMPLE by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/113

New Contributors

@44670 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/108

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.7...v0.2.8

Published by github-actions[bot] almost 3 years ago

py-alpaca-eval - Release v0.2.7

What's Changed

Update WizardLM 13B V1.2 results by @victorsungo in https://github.com/tatsu-lab/alpaca_eval/pull/99
[ENH] llama70B and chunking by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/100
[ENH] add pipeline meta parser by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/103
[CLEAN] Single annotator not abstract by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/104
Add OpenChat 3.1 Results by @imoneoi in https://github.com/tatsu-lab/alpaca_eval/pull/105
[ENH] add example with HF API by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/106

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.6...v0.2.7

Published by github-actions[bot] almost 3 years ago

py-alpaca-eval - Release v0.2.6

What's Changed

[STYLE] fix ill-formatted logging message by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/97
[STYLE] PR medium eval (ANNOTATOR_COLUMN) by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/98

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.5...v0.2.6

Published by github-actions[bot] almost 3 years ago

py-alpaca-eval - Release v0.2.5

What's Changed

[ENH] adds processors by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/95

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.4...v0.2.5

Published by github-actions[bot] almost 3 years ago

py-alpaca-eval - Release v0.2.4

What's Changed

Add Baichuan-13B-Chat Results by @inferLLM in https://github.com/tatsu-lab/alpaca_eval/pull/85
Add ChatGLM2-6B Results by @inferLLM in https://github.com/tatsu-lab/alpaca_eval/pull/86
[ENH] add chat llama2 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/87
[ENH] automatically add minimal/verified by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/88
[ENH] add replicate + llama 70B by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/90
[ENH] add llama 70B outputs by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/91
[ENH] optionally return raw completions by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/92
[ENH] eval_parser by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/93
[ENH] json parser by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/94

New Contributors

@inferLLM made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/85

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.3...v0.2.4

Published by github-actions[bot] almost 3 years ago

py-alpaca-eval - Release v0.2.3

What's Changed

[ENH] make completion_parser easier to inherit by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/81
[ENH] Add length by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/79
[ENH] add format_sample_sheets.py to CI by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/82
[ENH] adding samples to leaderboard by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/83

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.2...v0.2.3

Published by github-actions[bot] almost 3 years ago

py-alpaca-eval - Release v0.2.2

What's Changed

[ENH] add base annotator by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/76
[ENH] add claude v2 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/78

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.1...v0.2.2

Published by github-actions[bot] almost 3 years ago

py-alpaca-eval - Release v0.2.1

What's Changed

Update WizardLM 13B V1.1 results by @victorsungo in https://github.com/tatsu-lab/alpaca_eval/pull/66
[ENH] make it easier to cache to a DB by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/73
add vicuna v1.3 results by @rtaori in https://github.com/tatsu-lab/alpaca_eval/pull/74
gpt4 annotations for vicuna v1.3 by @rtaori in https://github.com/tatsu-lab/alpaca_eval/pull/75

New Contributors

@victorsungo made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/66

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.0...v0.2.1

Published by github-actions[bot] almost 3 years ago

py-alpaca-eval - Release v0.2.0

What's Changed

[CI] auto release by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/72

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.1.9...v0.2.0

Published by github-actions[bot] almost 3 years ago

py-alpaca-eval - v0.1.7

What's Changed

Add Custom OpenAI API Endpoint Support and OpenChat Results by @imoneoi in https://github.com/tatsu-lab/alpaca_eval/pull/42
get falcon models running decoding by @rtaori in https://github.com/tatsu-lab/alpaca_eval/pull/47
[TEST] test by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/50
[ENH] upgrade anthropic 0.3 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/54
[CLEAN] black by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/55
[TEST] setting up test CI by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/56
Add Baize v2 13B by @JetRunner in https://github.com/tatsu-lab/alpaca_eval/pull/49
[CI] leaderboard formatting by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/58
format leaderboard for baize by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/59
[ENH] remove inputs from example by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/60
[CLEAN] setting up precommit by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/61

New Contributors

@imoneoi made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/42
@JetRunner made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/49

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.1.6...v0.1.7.1

Published by YannDubs almost 3 years ago

py-alpaca-eval - v0.1.6

Published by rtaori almost 3 years ago

py-alpaca-eval - v0.1.5

Add more accessible chatgpt_fn annotator

Published by rtaori almost 3 years ago

py-alpaca-eval - v0.1.3

Update requirements to python 3.10

Published by rtaori almost 3 years ago

py-alpaca-eval - v0.1.1

Published by lxuechen almost 3 years ago

py-alpaca-eval - v0.1.0

Published by lxuechen almost 3 years ago
