Recent Releases of py-alpaca-eval
py-alpaca-eval - Release v0.6.6
What's Changed
- [ENH] add strict decoding OAI by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/394
- Add blendaxai-gm-l6-vo31 to AlpacaEval by @ym-blendax-ai in https://github.com/tatsu-lab/alpaca_eval/pull/399
- Added Llama3-PBM-Nova-70B model by @PKU-Baichuan in https://github.com/tatsu-lab/alpaca_eval/pull/395
- Add evaluator weighted_alpaca_eval_gpt-4o-mini-2024-07-18 by @tongyx361 in https://github.com/tatsu-lab/alpaca_eval/pull/401
- Add Shopee-SlimMoA-v1 to AlpacaEval by @LLM-Alignment-sh in https://github.com/tatsu-lab/alpaca_eval/pull/398
- [ENH] add metadata to completion: date, version,... by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/402
- Add REBEL-Llama-3-8B-Instruct-Armo to AlpacaEval by @ZhaolinGao in https://github.com/tatsu-lab/alpaca_eval/pull/403
- Add Llama-3-8B-Instruct-SkillMix to AlpacaEval by @parksimon0808 in https://github.com/tatsu-lab/alpaca_eval/pull/405
- Updated HF Link in model_configs for Llama-3-8B-Instruct-SkillMix by @parksimon0808 in https://github.com/tatsu-lab/alpaca_eval/pull/409
- Add SelfMoA_gemma-2-9b-it-SimPO, SelfMoA_gemma-2-9b-it-WPO-HB to AlpacaEval by @wenzhe-li in https://github.com/tatsu-lab/alpaca_eval/pull/411
- add Self-taught-llama3.1-70B-dpo as an evaluator by @tianlu-wang in https://github.com/tatsu-lab/alpaca_eval/pull/412
- Add GPO-Llama-3-8B-Instruct-GPM-2B and SPPO-Llama-3-8B-Instruct-GPM-2… by @xukp20 in https://github.com/tatsu-lab/alpaca_eval/pull/413
- Add NullModel to AlpacaEval by @xszheng2020 in https://github.com/tatsu-lab/alpaca_eval/pull/414
- Add Llama-3-Instruct-8B-RainbowPO to AlpacaEval by @hanyang1999 in https://github.com/tatsu-lab/alpaca_eval/pull/416
- add example for Llama3 vllm server by @cameron-chen in https://github.com/tatsu-lab/alpaca_eval/pull/404
- Add FuseChat-3.0 models to AlpacaEval by @yangzy39 in https://github.com/tatsu-lab/alpaca_eval/pull/426
- Add TOA to AlpacaEval by @oceanypt in https://github.com/tatsu-lab/alpaca_eval/pull/428
- [BUG] tool_calls by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/429
New Contributors
- @PKU-Baichuan made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/395
- @LLM-Alignment-sh made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/398
- @parksimon0808 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/405
- @wenzhe-li made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/411
- @tianlu-wang made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/412
- @xukp20 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/413
- @xszheng2020 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/414
- @hanyang1999 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/416
- @cameron-chen made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/404
- @yangzy39 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/426
- @oceanypt made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/428
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6.5...v0.6.6
- Jupyter Notebook
Published by github-actions[bot] about 1 year ago
py-alpaca-eval - Release v0.6.5
What's Changed
- Add Llama-3-Instruct-8B-WPO-HB-v2 to AlpacaEval by @wzhouad in https://github.com/tatsu-lab/alpaca_eval/pull/377
- [ENH] add llama 3.1 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/378
- [ENH] add example for LLama 3 vllm by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/381
- Add Infinity-Instruct-7M-0729-Llama31-70B, Infinity-Instruct-7M-0729-Llama31-8B, Infinity-Instruct-7M-0729-mistral-7B to AlpacaEval by @cszhengyh in https://github.com/tatsu-lab/alpaca_eval/pull/383
- Add gemma-2-9b-it-WPO-HB to AlpacaEval by @wzhouad in https://github.com/tatsu-lab/alpaca_eval/pull/384
- Add link to gemma-2-9b-it-WPO-HB by @wzhouad in https://github.com/tatsu-lab/alpaca_eval/pull/385
- Change the name of the Infinity-Instruct-7M-0729-Models to Infinity-Instruct-7M-Gen-Models by @cszhengyh in https://github.com/tatsu-lab/alpaca_eval/pull/387
- Add blendaxai-gm-l3-v35 to AlpacaEval by @ym-blendax-ai in https://github.com/tatsu-lab/alpaca_eval/pull/389
- [ENH] OpenAI use tools instead of functions by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/391
- [ENH] enable base_dir to be a list by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/392
- [ENH] add mistral v0.3, Qwen2 70b, gpt4 mini by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/393
New Contributors
- @wzhouad made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/377
- @ym-blendax-ai made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/389
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6.4...v0.6.5
Published by github-actions[bot] over 1 year ago
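The "OpenAI use tools instead of functions" entry above (pull/391) refers to OpenAI's Chat Completions request format, where the legacy `functions` / `function_call` parameters were superseded by `tools` / `tool_choice`. The following is a minimal sketch of how the two request shapes relate, not the code from that pull request: the helper names `functions_to_tools` and `function_call_to_tool_choice` and the `rank_outputs` schema are hypothetical and exist only for illustration.

```python
from typing import Any


def functions_to_tools(functions: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Wrap legacy `functions` entries in the newer `tools` envelope."""
    return [{"type": "function", "function": fn} for fn in functions]


def function_call_to_tool_choice(function_call: Any) -> Any:
    """Translate a legacy `function_call` value into a `tool_choice` value."""
    # String modes such as "auto" and "none" carry over unchanged.
    if isinstance(function_call, str):
        return function_call
    # A forced call {"name": ...} becomes a typed wrapper around the name.
    return {"type": "function", "function": {"name": function_call["name"]}}


# Hypothetical schema, for illustration only (not taken from alpaca_eval).
legacy_functions = [
    {
        "name": "rank_outputs",
        "description": "Pick the better of two model outputs.",
        "parameters": {
            "type": "object",
            "properties": {"preferred": {"type": "string", "enum": ["m", "M"]}},
            "required": ["preferred"],
        },
    }
]

tools = functions_to_tools(legacy_functions)
tool_choice = function_call_to_tool_choice({"name": "rank_outputs"})
```

The resulting `tools` and `tool_choice` values are plain dictionaries that can be passed as request parameters; on the response side, the model's answer then arrives under `tool_calls` rather than `function_call`.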
py-alpaca-eval - Release v0.6.4
What's Changed
- Add SPPO-Llama-3-Instruct-8B-PairRM to AlpacaEval by @Edward-Sun in https://github.com/tatsu-lab/alpaca_eval/pull/354
- Add Infinity-Instruct-3M-0613-Llama3-70B to AlpacaEval by @cszhengyh in https://github.com/tatsu-lab/alpaca_eval/pull/358
- Add SPPO-Gemma-2-9B-It-PairRM to AlpacaEval by @angelahzyuan in https://github.com/tatsu-lab/alpaca_eval/pull/359
- Add Infinity-Instruct-3M-0625-Models to AlpacaEval by @cszhengyh in https://github.com/tatsu-lab/alpaca_eval/pull/364
- Add Higgs Llama3-70B V2 Results by @sxjscience in https://github.com/tatsu-lab/alpaca_eval/pull/367
- Added Ghost 8B Beta (d0x5) model by @lh0x00 in https://github.com/tatsu-lab/alpaca_eval/pull/366
- Add gemma-2-9b-it-SimPO and gemma-2-9b-it-DPO to AlpacaEval by @xiamengzhou in https://github.com/tatsu-lab/alpaca_eval/pull/368
- [ENH] add CI test for unwanted files by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/369
- update model links by @xiamengzhou in https://github.com/tatsu-lab/alpaca_eval/pull/370
- [ENH] add the code to compute instruction_following by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/371
- [ENH] adding simplified glm by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/372
- [BUG] backward compatibility vllm do_sample -> use_beam_search by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/373
New Contributors
- @angelahzyuan made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/359
- @sxjscience made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/367
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6.3...v0.6.4
Published by github-actions[bot] over 1 year ago
py-alpaca-eval - Release v0.6.3
What's Changed
- Add the evaluation result for our latest model by @hendrydong in https://github.com/tatsu-lab/alpaca_eval/pull/286
- Add Ghost 7B Alpha to AlpacaEval by @lh0x00 in https://github.com/tatsu-lab/alpaca_eval/pull/288
- Add link for FsfairX-Zephyr-Chat-v0.1 by @hendrydong in https://github.com/tatsu-lab/alpaca_eval/pull/289
- add Qwen1.5-110B-Chat self-report results by @Lukeming-tsinghua in https://github.com/tatsu-lab/alpaca_eval/pull/291
- [ENH] verifying all the qwens by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/292
- Enable analyzing evaluators/annotators on data without multiple generator models by @rdnfn in https://github.com/tatsu-lab/alpaca_eval/pull/293
- Add Storm-7B to AlpacaEval by @yifan123 in https://github.com/tatsu-lab/alpaca_eval/pull/294
- Use verified by default by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/297
- Add SPPO-Mistral7B-PairRM to AlpacaEval by @Edward-Sun in https://github.com/tatsu-lab/alpaca_eval/pull/298
- Add ExPO results to AlpacaEval by @chujiezheng in https://github.com/tatsu-lab/alpaca_eval/pull/299
- Fix typo in README.md by @tongyx361 in https://github.com/tatsu-lab/alpaca_eval/pull/302
- Add Yi-Large Preview to AlpacaEval by @HyperdriveHustle in https://github.com/tatsu-lab/alpaca_eval/pull/304
- "Add Mistral-7B+RAHF-DUAL+LoRA to AlpacaEval" by @LiuAmber in https://github.com/tatsu-lab/alpaca_eval/pull/307
- [verified] Yi-large by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/309
- [ADD] GPT4-o by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/311
- [ENH] add LC SEM by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/317
- llama3 evaluator by @zhuang-li in https://github.com/tatsu-lab/alpaca_eval/pull/314
- Update README.md by @zhuang-li in https://github.com/tatsu-lab/alpaca_eval/pull/315
- [CLEAN] move evaluators lb llama3 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/318
- [ENH] vicuna 1.5 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/319
- Add Llama-3-Instruct-8B-SimPO to AlpacaEval by @xiamengzhou in https://github.com/tatsu-lab/alpaca_eval/pull/320
- [ENH] Use multi threading instead of processing by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/321
- Add Aligner 2B+GPT-4 Turbo (04/09) Results by @AlignInc in https://github.com/tatsu-lab/alpaca_eval/pull/324
- Add REBEL-Llama-3-8B-Instruct to AlpacaEval by @ZhaolinGao in https://github.com/tatsu-lab/alpaca_eval/pull/326
- [ENH&BUG] improve VLLM by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/330
- Add ExPO + Llama-3-Instruct-8B-SimPO results by @chujiezheng in https://github.com/tatsu-lab/alpaca_eval/pull/331
- fix model link by @chujiezheng in https://github.com/tatsu-lab/alpaca_eval/pull/332
- Add merlinite-7B-AOT to AlpacaEval by @imelnyk in https://github.com/tatsu-lab/alpaca_eval/pull/334
- [BUG] fix bs in VLLM and add chatml by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/338
- Add Together-MoA, Together-MoA-Lite to AlpacaEval by @IsThatYou in https://github.com/tatsu-lab/alpaca_eval/pull/342
- Add Nanbeige2-16B-Chat to AlpacaEval by @yuani114 in https://github.com/tatsu-lab/alpaca_eval/pull/345
- Add claude-3-5-sonnet-20240620 to AlpacaEval by @MarjovanLier in https://github.com/tatsu-lab/alpaca_eval/pull/348
- [BUG] trust repo alpaca_eval by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/349
- Add OpenPipe Mixture of Agents model to Alpaca Eval by @saum7800 in https://github.com/tatsu-lab/alpaca_eval/pull/347
- Add Storm-7B, Storm-7B (best-of-64) to AlpacaEval by @yifan123 in https://github.com/tatsu-lab/alpaca_eval/pull/344
- Add Infinity-Instruct-3M-0613-Mistral-7B to AlpacaEval by @cszhengyh in https://github.com/tatsu-lab/alpaca_eval/pull/351
New Contributors
- @hendrydong made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/286
- @lh0x00 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/288
- @yifan123 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/294
- @Edward-Sun made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/298
- @chujiezheng made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/299
- @tongyx361 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/302
- @LiuAmber made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/307
- @zhuang-li made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/314
- @xiamengzhou made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/320
- @ZhaolinGao made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/326
- @imelnyk made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/334
- @IsThatYou made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/342
- @MarjovanLier made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/348
- @saum7800 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/347
- @cszhengyh made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/351
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6.2...v0.6.3
Published by github-actions[bot] over 1 year ago
py-alpaca-eval - Release v0.6.2
What's Changed
- [BUG] backward compatibility with AF by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/278
- Add Nanbeige-Plus-Chat-v0.1 to AlpacaEval by @yuani114 in https://github.com/tatsu-lab/alpaca_eval/pull/279
- Update README.md by @Dominic789654 in https://github.com/tatsu-lab/alpaca_eval/pull/280
- [BUG] revert to GPT4 preview 1106 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/283
- Add support for analyzing evaluators with custom cross-annotations by @rdnfn in https://github.com/tatsu-lab/alpaca_eval/pull/281
- [ENH] llama3 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/285
New Contributors
- @Dominic789654 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/280
- @rdnfn made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/281
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6.1...v0.6.2
Published by github-actions[bot] almost 2 years ago
py-alpaca-eval - Release v0.6.1
What's Changed
- Add Aligner-2B+Qwen1.5-72B-Chat & Aligner-2B+Claude3 Opus to AlpacaEval by @AlignInc in https://github.com/tatsu-lab/alpaca_eval/pull/259
- Supplement for Aligner by @AlignInc in https://github.com/tatsu-lab/alpaca_eval/pull/261
- Add Ein-70B-v0.1 to AlpacaEval by @bin-bi in https://github.com/tatsu-lab/alpaca_eval/pull/262
- Add TempNet-LLaMA2-Chat to AlpacaEval by @xumao-nju in https://github.com/tatsu-lab/alpaca_eval/pull/264
- Add Conifer-7B-DPO to AlpacaEval by @liulixin29 in https://github.com/tatsu-lab/alpaca_eval/pull/267
- Updating link to a super fast demo! by @kyleliang919 in https://github.com/tatsu-lab/alpaca_eval/pull/268
- Add Nanbeige2-8B-Chat to AlpacaEval by @yuani114 in https://github.com/tatsu-lab/alpaca_eval/pull/274
- [ENH] adding drbx and gpt4 turbo by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/275
New Contributors
- @AlignInc made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/259
- @bin-bi made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/262
- @xumao-nju made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/264
- @liulixin29 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/267
- @yuani114 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/274
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6...v0.6.1
Published by github-actions[bot] almost 2 years ago
py-alpaca-eval - Release v0.6
What's Changed
- [DATA] Add Gemma by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/242
- [NOTEBOOK] adding final length correction notebook. by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/244
- add Mistral-7B-ReMax-v0.1 by @liziniu in https://github.com/tatsu-lab/alpaca_eval/pull/245
- [ENH] add claude 3 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/247
- [ENH] add contextual by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/250
- [ENH] add mistral large by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/251
- Add Samba-CoE-v0.2 to AlpacaEval by @kyleliang919 in https://github.com/tatsu-lab/alpaca_eval/pull/253
- Add Samba-CoE-v0.2-best-of-16 to AlpacaEval by @kyleliang919 in https://github.com/tatsu-lab/alpaca_eval/pull/256
- Add Mistral-ORPO-Beta to AlpacaEval by @jiwooya1000 in https://github.com/tatsu-lab/alpaca_eval/pull/257
- Yann/length correction by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/258
New Contributors
- @liziniu made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/245
- @kyleliang919 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/253
- @jiwooya1000 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/257
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.4...v0.6
Published by github-actions[bot] almost 2 years ago
py-alpaca-eval - Release v0.5.4
What's Changed
- Add Qwen1.5-72B-Chat to AlpacaEval by @Lukeming-tsinghua in https://github.com/tatsu-lab/alpaca_eval/pull/226
- Add claude-instant-1.2, deepseek-llm-67b-chat, wizardlm-70b, Qwen-14B-Chat (config + outputs without annotations) by @gblazex in https://github.com/tatsu-lab/alpaca_eval/pull/228
- [DATA] Adding annotations for the arena models by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/229
- Update README.md - Add missing "Y" to "ou" by @yoderj in https://github.com/tatsu-lab/alpaca_eval/pull/230
- [DEV] Analyzing length-controlled metrics. by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/231
- [DOC] add annotation interpretation by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/232
- [DATA] add results from the Arena openai models by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/234
- update ELO for llama-2-13b-chat-hf by @gblazex in https://github.com/tatsu-lab/alpaca_eval/pull/235
- [NOTEBOOK] add length-corrected GLM by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/237
- [ENH] add inverse mapper to make sure in and out types are the same by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/240
- [ENH] update to allow AF to use AE by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/241
New Contributors
- @Lukeming-tsinghua made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/226
- @yoderj made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/230
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.3...v0.5.4
Published by github-actions[bot] almost 2 years ago
py-alpaca-eval - Release v0.5.3
What's Changed
- [ENH] add mistral-medium by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/205
- [ENH] add internlm2-chat-20b-ppo by @C1rN09 in https://github.com/tatsu-lab/alpaca_eval/pull/207
- prettify "pretty_name" of internlm2 by @C1rN09 in https://github.com/tatsu-lab/alpaca_eval/pull/208
- [ENH] add outputs & configs from dolphin 2.2.1 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/209
- Add PairRM 0.4B + Yi-34B-Chat to AlpacaEval 2.0 by @jdf-prog in https://github.com/tatsu-lab/alpaca_eval/pull/210
- dolphin 2.1.1 configs.yaml by @gblazex in https://github.com/tatsu-lab/alpaca_eval/pull/212
- Update README.md (small typo) by @xwinxu in https://github.com/tatsu-lab/alpaca_eval/pull/213
- [TEST]: fix ordering of df by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/214
- Add Snorkel-Mistral-PairRM-DPO (best-of-16) to Alpaca Eval 2.0 by @viethoangtranduong in https://github.com/tatsu-lab/alpaca_eval/pull/215
- update InternLM2 chat template by @C1rN09 in https://github.com/tatsu-lab/alpaca_eval/pull/216
- Add Starling-LM-7B-alpha, vicuna-13b-v1.5, vicuna-7b-v1.5 to AlpacaEval (config + outputs without annotations) by @gblazex in https://github.com/tatsu-lab/alpaca_eval/pull/217
- [RES] add 3 models for arena correlations by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/218
- Add xwinlm-70b-v0.3 to AlpacaEval by @nbl97 in https://github.com/tatsu-lab/alpaca_eval/pull/221
- [ENH] add referenced_models locally by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/224
New Contributors
- @C1rN09 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/207
- @gblazex made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/212
- @xwinxu made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/213
- @viethoangtranduong made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/215
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.2...v0.5.3
Published by github-actions[bot] about 2 years ago
py-alpaca-eval - Release v0.5.2
What's Changed
- [BUG] force openai >1.5.0 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/202
- [WIP] precompute all leaderboard for AE2 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/199
- [ENH] add OpenHermes by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/203
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.1...v0.5.2
Published by github-actions[bot] about 2 years ago
py-alpaca-eval - Release v0.5.1
What's Changed
- [BUG] fix no OAI org id set by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/200
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.0...v0.5.1
Published by github-actions[bot] about 2 years ago
py-alpaca-eval - Release v0.5.0
What's Changed
- Fix mssg check by @Muennighoff in https://github.com/tatsu-lab/alpaca_eval/pull/174
- Add MiniChat-1.5-3B to AlpacaEval and Fix MiniChat-3B by @GeneZC in https://github.com/tatsu-lab/alpaca_eval/pull/176
- Add 01-ai/Yi-34B-Chat to AlpacaEval by @HyperdriveHustle in https://github.com/tatsu-lab/alpaca_eval/pull/175
- feat: add way to verify results by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/177
- show img in readme by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/178
- Add PairRM best-of-16 to AlpacaEval by @jdf-prog in https://github.com/tatsu-lab/alpaca_eval/pull/181
- Verify Yi by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/182
- chore: add phi-2 sft by @lxuechen in https://github.com/tatsu-lab/alpaca_eval/pull/184
- add cut-13b by @wwxu21 in https://github.com/tatsu-lab/alpaca_eval/pull/186
- chore: add phi-2 dpo by @lxuechen in https://github.com/tatsu-lab/alpaca_eval/pull/185
- Support phi2, Support SOLAR 10.7B LMCocktail by @yhyu13 in https://github.com/tatsu-lab/alpaca_eval/pull/183
- Update openai.py by @Muennighoff in https://github.com/tatsu-lab/alpaca_eval/pull/188
- chore: add link for phi-2-sft by @lxuechen in https://github.com/tatsu-lab/alpaca_eval/pull/190
- chore: fix links by @lxuechen in https://github.com/tatsu-lab/alpaca_eval/pull/191
- Add deita-7b-v1.0 model by @VPeterV in https://github.com/tatsu-lab/alpaca_eval/pull/192
- [ENH] Azure OAI client & more general way of switching between client configs by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/193
- [ENH] Weighted win rates by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/189
- [ENH] new models: Gemini / claude2.1 / mistral / mixtral / .. by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/195
- [ENH] alpaca_eval 2.0 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/196
New Contributors
- @Muennighoff made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/174
- @HyperdriveHustle made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/175
- @jdf-prog made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/181
- @lxuechen made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/184
- @wwxu21 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/186
- @yhyu13 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/183
- @VPeterV made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/192
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.6...v0.5.0
Published by github-actions[bot] about 2 years ago
py-alpaca-eval - Release v0.3.6
What's Changed
- feat: verify all the cohere model & use it as eval by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/170
- Add Tulu 2 models to AlpacaEval by @hamishivi in https://github.com/tatsu-lab/alpaca_eval/pull/171
New Contributors
- @hamishivi made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/171
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.5...v0.3.6
Published by github-actions[bot] about 2 years ago
py-alpaca-eval - Release v0.3.5
What's Changed
- [WIP] GPT4 turbo as evaluator by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/160
- [ENH] add GPT4 turbo as evaluator in README by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/165
- Add minichat-3b to AlpacaEval by @GeneZC in https://github.com/tatsu-lab/alpaca_eval/pull/167
- fix: filter openai spam filter by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/169
New Contributors
- @GeneZC made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/167
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.3...v0.3.5
Published by github-actions[bot] over 2 years ago
py-alpaca-eval - Release vv0.3.4
What's Changed
- [WIP] GPT4 turbo as evaluator by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/160
- [ENH] add GPT4 turbo as evaluator in README by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/165
- Add minichat-3b to AlpacaEval by @GeneZC in https://github.com/tatsu-lab/alpaca_eval/pull/167
- fix: filter openai spam filter by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/169
New Contributors
- @GeneZC made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/167
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.3...vv0.3.4
Published by github-actions[bot] over 2 years ago
py-alpaca-eval - Release v0.3.3
What's Changed
- Gpt4 turbo by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/159
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.2...v0.3.3
Published by github-actions[bot] over 2 years ago
py-alpaca-eval - Release v0.3.2
What's Changed
- add UltraLM-13b-V2.0/UltraLM-13b-V2.0-best-of-16/UltraLM-13b-best-of-16 to AlpacaEval by @lifan-yuan in https://github.com/tatsu-lab/alpaca_eval/pull/139
- Add annotations & fix leaderboard by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/142
- refresh Cohere by @sanderland in https://github.com/tatsu-lab/alpaca_eval/pull/141
- Add PlatoLM-7B to AlpacaEval by @renatz in https://github.com/tatsu-lab/alpaca_eval/pull/143
- Add evo-7b to AlpacaEval by @zfang in https://github.com/tatsu-lab/alpaca_eval/pull/144
- Add NEFTune models to AlpacaEval by @neelsjain in https://github.com/tatsu-lab/alpaca_eval/pull/146
- Add claude2-alpaca-13b, recycled-wizardlm-7b-v1.0, recycled-wizardlm-… by @MingLiiii in https://github.com/tatsu-lab/alpaca_eval/pull/147
- Add CausalLM/14B to AlpacaEval by @CausalLM in https://github.com/tatsu-lab/alpaca_eval/pull/148
- Add Zephyr 7B evals by @lewtun in https://github.com/tatsu-lab/alpaca_eval/pull/152
- Add Evo v2 7B by @zfang in https://github.com/tatsu-lab/alpaca_eval/pull/153
- Add decoder for calling Anthropic models via Amazon Bedrock by @billcai in https://github.com/tatsu-lab/alpaca_eval/pull/151
- cohere update by @sanderland in https://github.com/tatsu-lab/alpaca_eval/pull/155
- feat: upgrade to openai 1.0.0 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/157
New Contributors
- @lifan-yuan made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/139
- @renatz made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/143
- @zfang made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/144
- @neelsjain made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/146
- @MingLiiii made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/147
- @CausalLM made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/148
- @lewtun made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/152
- @billcai made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/151
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.1...v0.3.2
Published by github-actions[bot] over 2 years ago
py-alpaca-eval - Release v0.3.1
What's Changed
- Add results of Xwin-LM by @nbl97 in https://github.com/tatsu-lab/alpaca_eval/pull/135
- [ENH] add gpt 3.5 instruct by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/137
New Contributors
- @nbl97 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/135
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.0...v0.3.1
Published by github-actions[bot] over 2 years ago
py-alpaca-eval - Release v0.3.0
What's Changed
- [ENH] add fixed gpt4 version annotator by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/127
- Add openbuddy-llama2-13b-v11.1 by @44670 in https://github.com/tatsu-lab/alpaca_eval/pull/129
- [ENH] add max concurrency oai by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/131
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.9...v0.3.0
Published by github-actions[bot] over 2 years ago
py-alpaca-eval - Release v0.2.9
What's Changed
- Ensure primary keys are string & decrease processes for OpenAI by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/116
- Add JinaChat to the leaderboards by @jupyterjazz in https://github.com/tatsu-lab/alpaca_eval/pull/117
- [BUG] jina chat error in configs by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/118
- Add Humpback to AlpacaEval by @xianxl in https://github.com/tatsu-lab/alpaca_eval/pull/120
- update Humpback results by @xianxl in https://github.com/tatsu-lab/alpaca_eval/pull/121
- add link to Humpback paper by @xianxl in https://github.com/tatsu-lab/alpaca_eval/pull/122
- Add vllm decoder for model inference by @44670 in https://github.com/tatsu-lab/alpaca_eval/pull/124
- [ENH] return completions_all and allow sequence of max_tokens by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/125
New Contributors
- @jupyterjazz made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/117
- @xianxl made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/120
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.8...v0.2.9
Published by github-actions[bot] over 2 years ago
py-alpaca-eval - Release v0.2.8
What's Changed
- [BUG] closes #77 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/109
- Add openbuddy-llama-30b-v7.1 to AlpacaEval by @44670 in https://github.com/tatsu-lab/alpaca_eval/pull/108
- Fix typo on pretty_name by @44670 in https://github.com/tatsu-lab/alpaca_eval/pull/110
- Add openbuddy-falcon-40b-v9 to AlpacaEval by @44670 in https://github.com/tatsu-lab/alpaca_eval/pull/111
- [CLEAN] remove warning by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/112
- [BUG] utils.DUMMY_EXAMPLE by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/113
New Contributors
- @44670 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/108
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.7...v0.2.8
Published by github-actions[bot] over 2 years ago
py-alpaca-eval - Release v0.2.7
What's Changed
- Update WizardLM 13B V1.2 results by @victorsungo in https://github.com/tatsu-lab/alpaca_eval/pull/99
- [ENH] llama70B and chunking by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/100
- [ENH] add pipeline meta parser by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/103
- [CLEAN] Single annotator not abstract by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/104
- Add OpenChat 3.1 Results by @imoneoi in https://github.com/tatsu-lab/alpaca_eval/pull/105
- [ENH] add example with HF API by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/106
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.6...v0.2.7
Published by github-actions[bot] over 2 years ago
py-alpaca-eval - Release v0.2.6
What's Changed
- [STYLE] fix ill-formatted logging message by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/97
- [STYLE] PR medium eval (ANNOTATOR_COLUMN) by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/98
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.5...v0.2.6
Published by github-actions[bot] over 2 years ago
py-alpaca-eval - Release v0.2.5
What's Changed
- [ENH] adds processors by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/95
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.4...v0.2.5
Published by github-actions[bot] over 2 years ago
py-alpaca-eval - Release v0.2.4
What's Changed
- Add Baichuan-13B-Chat Results by @inferLLM in https://github.com/tatsu-lab/alpaca_eval/pull/85
- Add ChatGLM2-6B Results by @inferLLM in https://github.com/tatsu-lab/alpaca_eval/pull/86
- [ENH] add chat llama2 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/87
- [ENH] automatically add minimal/verified by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/88
- [ENH] add replicate + llama 70B by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/90
- [ENH] add llama 70B outputs by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/91
- [ENH] optionally return raw completions by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/92
- [ENH] eval_parser by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/93
- [ENH] json parser by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/94
New Contributors
- @inferLLM made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/85
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.3...v0.2.4
Published by github-actions[bot] over 2 years ago
py-alpaca-eval - Release v0.2.3
What's Changed
- [ENH] make completion_parser easier to inherit by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/81
- [ENH] Add length by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/79
- [ENH] add format_sample_sheets.py to CI by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/82
- [ENH] adding samples to leaderboard by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/83
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.2...v0.2.3
Published by github-actions[bot] over 2 years ago
py-alpaca-eval - Release v0.2.2
What's Changed
- [ENH] add base annotator by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/76
- [ENH] add claude v2 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/78
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.1...v0.2.2
Published by github-actions[bot] over 2 years ago
py-alpaca-eval - Release v0.2.1
What's Changed
- Update WizardLM 13B V1.1 results by @victorsungo in https://github.com/tatsu-lab/alpaca_eval/pull/66
- [ENH] make it easier to cache to a DB by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/73
- add vicuna v1.3 results by @rtaori in https://github.com/tatsu-lab/alpaca_eval/pull/74
- gpt4 annotations for vicuna v1.3 by @rtaori in https://github.com/tatsu-lab/alpaca_eval/pull/75
New Contributors
- @victorsungo made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/66
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.0...v0.2.1
Published by github-actions[bot] over 2 years ago
py-alpaca-eval - Release v0.2.0
What's Changed
- [CI] auto release by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/72
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.1.9...v0.2.0
Published by github-actions[bot] over 2 years ago
py-alpaca-eval - v0.1.7
What's Changed
- Add Custom OpenAI API Endpoint Support and OpenChat Results by @imoneoi in https://github.com/tatsu-lab/alpaca_eval/pull/42
- get falcon models running decoding by @rtaori in https://github.com/tatsu-lab/alpaca_eval/pull/47
- [TEST] test by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/50
- [ENH] upgrade anthropic 0.3 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/54
- [CLEAN] black by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/55
- [TEST] setting up test CI by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/56
- Add Baize v2 13B by @JetRunner in https://github.com/tatsu-lab/alpaca_eval/pull/49
- [CI] leaderboard formatting by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/58
- format leaderboard for baize by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/59
- [ENH] remove inputs from example by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/60
- [CLEAN] setting up precommit by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/61
New Contributors
- @imoneoi made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/42
- @JetRunner made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/49
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.1.6...v0.1.7.1
Published by YannDubs over 2 years ago
py-alpaca-eval - v0.1.5
Add more accessible chatgpt_fn annotator
Published by rtaori over 2 years ago
py-alpaca-eval - v0.1.3
Update requirements to python 3.10
Published by rtaori over 2 years ago