Recent Releases of py-alpaca-eval

py-alpaca-eval - Release v0.6.6

What's Changed

  • [ENH] add strict decoding OAI by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/394
  • Add blendaxai-gm-l6-vo31 to AlpacaEval by @ym-blendax-ai in https://github.com/tatsu-lab/alpaca_eval/pull/399
  • Added Llama3-PBM-Nova-70B model by @PKU-Baichuan in https://github.com/tatsu-lab/alpaca_eval/pull/395
  • Add evaluator weighted_alpaca_eval_gpt-4o-mini-2024-07-18 by @tongyx361 in https://github.com/tatsu-lab/alpaca_eval/pull/401
  • Add Shopee-SlimMoA-v1 to AlpacaEval by @LLM-Alignment-sh in https://github.com/tatsu-lab/alpaca_eval/pull/398
  • [ENH] add metadata to completion: date, version,... by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/402
  • Add REBEL-Llama-3-8B-Instruct-Armo to AlpacaEval by @ZhaolinGao in https://github.com/tatsu-lab/alpaca_eval/pull/403
  • Add Llama-3-8B-Instruct-SkillMix to AlpacaEval by @parksimon0808 in https://github.com/tatsu-lab/alpaca_eval/pull/405
  • Updated HF Link in model_configs for Llama-3-8B-Instruct-SkillMix by @parksimon0808 in https://github.com/tatsu-lab/alpaca_eval/pull/409
  • Add SelfMoA_gemma-2-9b-it-SimPO, SelfMoA_gemma-2-9b-it-WPO-HB to AlpacaEval by @wenzhe-li in https://github.com/tatsu-lab/alpaca_eval/pull/411
  • add Self-taught-llama3.1-70B-dpo as a evaluator by @tianlu-wang in https://github.com/tatsu-lab/alpaca_eval/pull/412
  • Add GPO-Llama-3-8B-Instruct-GPM-2B and SPPO-Llama-3-8B-Instruct-GPM-2… by @xukp20 in https://github.com/tatsu-lab/alpaca_eval/pull/413
  • Add NullModel to AlpacaEval by @xszheng2020 in https://github.com/tatsu-lab/alpaca_eval/pull/414
  • Add Llama-3-Instruct-8B-RainbowPO to AlpacaEval by @hanyang1999 in https://github.com/tatsu-lab/alpaca_eval/pull/416
  • add example for Llama3 vllm server by @cameron-chen in https://github.com/tatsu-lab/alpaca_eval/pull/404
  • Add FuseChat-3.0 models to AlpacaEval by @yangzy39 in https://github.com/tatsu-lab/alpaca_eval/pull/426
  • Add TOA to AlpacaEval by @oceanypt in https://github.com/tatsu-lab/alpaca_eval/pull/428
  • [BUG] tool_calls by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/429

New Contributors

  • @PKU-Baichuan made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/395
  • @LLM-Alignment-sh made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/398
  • @parksimon0808 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/405
  • @wenzhe-li made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/411
  • @tianlu-wang made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/412
  • @xukp20 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/413
  • @xszheng2020 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/414
  • @hanyang1999 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/416
  • @cameron-chen made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/404
  • @yangzy39 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/426
  • @oceanypt made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/428

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6.5...v0.6.6

- Jupyter Notebook
Published by github-actions[bot] about 1 year ago

py-alpaca-eval - Release v0.6.5

What's Changed

  • Add Llama-3-Instruct-8B-WPO-HB-v2 to AlpacaEval by @wzhouad in https://github.com/tatsu-lab/alpaca_eval/pull/377
  • [ENH] add llama 3.1 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/378
  • [ENH] add example for LLama 3 vllm by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/381
  • Add Infinity-Instruct-7M-0729-Llama31-70B, Infinity-Instruct-7M-0729-Llama31-8B, Infinity-Instruct-7M-0729-mistral-7B to AlpacaEval by @cszhengyh in https://github.com/tatsu-lab/alpaca_eval/pull/383
  • Add gemma-2-9b-it-WPO-HB to AlpacaEval by @wzhouad in https://github.com/tatsu-lab/alpaca_eval/pull/384
  • Add link to gemma-2-9b-it-WPO-HB by @wzhouad in https://github.com/tatsu-lab/alpaca_eval/pull/385
  • Change the name of the Infinity-Instruct-7M-0729-Models to Infinity-Instruct-7M-Gen-Models by @cszhengyh in https://github.com/tatsu-lab/alpaca_eval/pull/387
  • Add blendaxai-gm-l3-v35 to AlpacaEval by @ym-blendax-ai in https://github.com/tatsu-lab/alpaca_eval/pull/389
  • [ENH] OpenAI use tools instead of functions by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/391
  • [ENH] enable base_dir to be a list by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/392
  • [ENH] add mistral v0.3, Qwen2 70b, gtp4 mini by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/393

New Contributors

  • @wzhouad made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/377
  • @ym-blendax-ai made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/389

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6.4...v0.6.5

Published by github-actions[bot] over 1 year ago

py-alpaca-eval - Release v0.6.4

What's Changed

  • Add SPPO-Llama-3-Instruct-8B-PairRM to AlpacaEval by @Edward-Sun in https://github.com/tatsu-lab/alpaca_eval/pull/354
  • Add Infinity-Instruct-3M-0613-Llama3-70B to AlpacaEval by @cszhengyh in https://github.com/tatsu-lab/alpaca_eval/pull/358
  • Add SPPO-Gemma-2-9B-It-PairRM to AlpacaEval by @angelahzyuan in https://github.com/tatsu-lab/alpaca_eval/pull/359
  • Add Infinity-Instruct-3M-0625-Models to AlpacaEval by @cszhengyh in https://github.com/tatsu-lab/alpaca_eval/pull/364
  • Add Higgs Llama3-70B V2 Results by @sxjscience in https://github.com/tatsu-lab/alpaca_eval/pull/367
  • Added Ghost 8B Beta (d0x5) model by @lh0x00 in https://github.com/tatsu-lab/alpaca_eval/pull/366
  • Add gemma-2-9b-it-SimPO and gemma-2-9b-it-DPO to AlpacaEval by @xiamengzhou in https://github.com/tatsu-lab/alpaca_eval/pull/368
  • [ENH] add CI test for unwanted files by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/369
  • update model links by @xiamengzhou in https://github.com/tatsu-lab/alpaca_eval/pull/370
  • [ENH] add the code to compute instruction_following by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/371
  • [ENH] adding simplified glm by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/372
  • [BUG] backward compatibility vllm do_sample -> use_beam_search by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/373

New Contributors

  • @angelahzyuan made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/359
  • @sxjscience made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/367

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6.3...v0.6.4

Published by github-actions[bot] over 1 year ago

py-alpaca-eval - Release v0.6.3

What's Changed

  • Add the evaluation result for our latest model by @hendrydong in https://github.com/tatsu-lab/alpaca_eval/pull/286
  • Add Ghost 7B Alpha to AlpacaEval by @lh0x00 in https://github.com/tatsu-lab/alpaca_eval/pull/288
  • Add link for FsfairX-Zephyr-Chat-v0.1 by @hendrydong in https://github.com/tatsu-lab/alpaca_eval/pull/289
  • add Qwen1.5-110B-Chat self-report results by @Lukeming-tsinghua in https://github.com/tatsu-lab/alpaca_eval/pull/291
  • [ENH] verifying all the qwens by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/292
  • Enable analyzing evaluators/annotators on data without multiple generator models by @rdnfn in https://github.com/tatsu-lab/alpaca_eval/pull/293
  • Add Storm-7B to AlpacaEval by @yifan123 in https://github.com/tatsu-lab/alpaca_eval/pull/294
  • Use verified by default by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/297
  • Add SPPO-Mistral7B-PairRM to AlpacaEval by @Edward-Sun in https://github.com/tatsu-lab/alpaca_eval/pull/298
  • Add ExPO results to AlpacaEval by @chujiezheng in https://github.com/tatsu-lab/alpaca_eval/pull/299
  • Fix typo in README.md by @tongyx361 in https://github.com/tatsu-lab/alpaca_eval/pull/302
  • Add Yi-Large Preview to AlpacaEval by @HyperdriveHustle in https://github.com/tatsu-lab/alpaca_eval/pull/304
  • "Add Mistral-7B+RAHF-DUAL+LoRA to AlpacaEval" by @LiuAmber in https://github.com/tatsu-lab/alpaca_eval/pull/307
  • [verified] Yi-large by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/309
  • [ADD] GPT4-o by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/311
  • [ENH] add LC SEM by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/317
  • llama3 evaluator by @zhuang-li in https://github.com/tatsu-lab/alpaca_eval/pull/314
  • Update README.md by @zhuang-li in https://github.com/tatsu-lab/alpaca_eval/pull/315
  • [CLEAN] move evaluators lb llama3 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/318
  • [ENH] vicuna 1.5 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/319
  • Add Llama-3-Instruct-8B-SimPO to AlpacaEval by @xiamengzhou in https://github.com/tatsu-lab/alpaca_eval/pull/320
  • [ENH] Use multi threading instead of processing by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/321
  • Add Aligner 2B+GPT-4 Turbo (04/09) Results by @AlignInc in https://github.com/tatsu-lab/alpaca_eval/pull/324
  • Add REBEL-Llama-3-8B-Instruct to AlpacaEval by @ZhaolinGao in https://github.com/tatsu-lab/alpaca_eval/pull/326
  • [ENH&BUG] improve VLLM by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/330
  • Add ExPO + Llama-3-Instruct-8B-SimPO results by @chujiezheng in https://github.com/tatsu-lab/alpaca_eval/pull/331
  • fix model link by @chujiezheng in https://github.com/tatsu-lab/alpaca_eval/pull/332
  • Add merlinite-7B-AOT to AlpacaEval by @imelnyk in https://github.com/tatsu-lab/alpaca_eval/pull/334
  • [BUG] fix bs in VLLM and add chatml by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/338
  • Add Together-MoA, Together-MoA-Lite to AlpacaEval by @IsThatYou in https://github.com/tatsu-lab/alpaca_eval/pull/342
  • Add Nanbeige2-16B-Chat to AlpacaEval by @yuani114 in https://github.com/tatsu-lab/alpaca_eval/pull/345
  • Add claude-3-5-sonnet-20240620 to AlpacaEval by @MarjovanLier in https://github.com/tatsu-lab/alpaca_eval/pull/348
  • [BUG] trust repo alpaca_eval by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/349
  • Add OpenPipe Mixture of Agents model to Alpaca Eval by @saum7800 in https://github.com/tatsu-lab/alpaca_eval/pull/347
  • Add Storm-7B, Storm-7B (best-of-64) to AlpacaEval by @yifan123 in https://github.com/tatsu-lab/alpaca_eval/pull/344
  • Add Infinity-Instruct-3M-0613-Mistral-7B to AlpacaEval by @cszhengyh in https://github.com/tatsu-lab/alpaca_eval/pull/351

New Contributors

  • @hendrydong made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/286
  • @lh0x00 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/288
  • @yifan123 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/294
  • @Edward-Sun made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/298
  • @chujiezheng made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/299
  • @tongyx361 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/302
  • @LiuAmber made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/307
  • @zhuang-li made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/314
  • @xiamengzhou made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/320
  • @ZhaolinGao made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/326
  • @imelnyk made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/334
  • @IsThatYou made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/342
  • @MarjovanLier made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/348
  • @saum7800 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/347
  • @cszhengyh made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/351

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6.2...v0.6.3

Published by github-actions[bot] over 1 year ago

py-alpaca-eval - Release v0.6.2

What's Changed

  • [BUG] backward compatibility with AF by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/278
  • Add Nanbeige-Plus-Chat-v0.1 to AlpacaEval by @yuani114 in https://github.com/tatsu-lab/alpaca_eval/pull/279
  • Update README.md by @Dominic789654 in https://github.com/tatsu-lab/alpaca_eval/pull/280
  • [BUG] revert to GPT4 preview 1106 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/283
  • Add support for analyzing evaluators with custom cross-annotations by @rdnfn in https://github.com/tatsu-lab/alpaca_eval/pull/281
  • [ENH] llama3 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/285

New Contributors

  • @Dominic789654 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/280
  • @rdnfn made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/281

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6.1...v0.6.2

Published by github-actions[bot] almost 2 years ago

py-alpaca-eval - Release v0.6.1

What's Changed

  • Add Aligner-2B+Qwen1.5-72B-Chat & Aligner-2B+Claude3 Opus to AlpacaEval by @AlignInc in https://github.com/tatsu-lab/alpaca_eval/pull/259
  • Supplement for Aligner by @AlignInc in https://github.com/tatsu-lab/alpaca_eval/pull/261
  • Add Ein-70B-v0.1 to AlpacaEval by @bin-bi in https://github.com/tatsu-lab/alpaca_eval/pull/262
  • Add TempNet-LLaMA2-Chat to AlpacaEval by @xumao-nju in https://github.com/tatsu-lab/alpaca_eval/pull/264
  • Add Conifer-7B-DPO to AlpacaEval by @liulixin29 in https://github.com/tatsu-lab/alpaca_eval/pull/267
  • Updating link to a super fast demo! by @kyleliang919 in https://github.com/tatsu-lab/alpaca_eval/pull/268
  • Add Nanbeige2-8B-Chat to AlpacaEval by @yuani114 in https://github.com/tatsu-lab/alpaca_eval/pull/274
  • [ENH] adding drbx and gpt4 turbo by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/275

New Contributors

  • @AlignInc made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/259
  • @bin-bi made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/262
  • @xumao-nju made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/264
  • @liulixin29 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/267
  • @yuani114 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/274

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6...v0.6.1

Published by github-actions[bot] almost 2 years ago

py-alpaca-eval - Release v0.6

What's Changed

  • [DATA] Add Gemma by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/242
  • [NOTEBOOK] adding final length correction notebook. by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/244
  • add Mistral-7B-ReMax-v0.1 by @liziniu in https://github.com/tatsu-lab/alpaca_eval/pull/245
  • [ENH] add claude 3 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/247
  • [ENH] add contextual by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/250
  • [ENH] add mistral large by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/251
  • Add Samba-CoE-v0.2 to AlpacaEval by @kyleliang919 in https://github.com/tatsu-lab/alpaca_eval/pull/253
  • Add Samba-CoE-v0.2-best-of-16 to AlpacaEval by @kyleliang919 in https://github.com/tatsu-lab/alpaca_eval/pull/256
  • Add Mistral-ORPO-Beta to AlpacaEval by @jiwooya1000 in https://github.com/tatsu-lab/alpaca_eval/pull/257
  • Yann/length correction by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/258

New Contributors

  • @liziniu made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/245
  • @kyleliang919 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/253
  • @jiwooya1000 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/257

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.4...v0.6

Published by github-actions[bot] almost 2 years ago

py-alpaca-eval - Release v0.5.4

What's Changed

  • Add Qwen1.5-72B-Chat to AlpacaEval by @Lukeming-tsinghua in https://github.com/tatsu-lab/alpaca_eval/pull/226
  • Add claude-instant-1.2, deepseek-llm-67b-chat, wizardlm-70b, Qwen-14B-Chat (config + outputs without annotations) by @gblazex in https://github.com/tatsu-lab/alpaca_eval/pull/228
  • [DATA] Adding annotations for the arena models by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/229
  • Update README.md - Add missing "Y" to "ou" by @yoderj in https://github.com/tatsu-lab/alpaca_eval/pull/230
  • [DEV] Analyzing length-controlled metrics. by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/231
  • [DOC] add annotation interpretation by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/232
  • [DATA] add results from the Arena openai models by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/234
  • update ELO for llama-2-13b-chat-hf by @gblazex in https://github.com/tatsu-lab/alpaca_eval/pull/235
  • [NOTEBOOK] add length-corrected GLM by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/237
  • [ENH] add inverse mapper to make sure in and out types are the same by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/240
  • [ENH] update to allow AF to use AE by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/241

New Contributors

  • @Lukeming-tsinghua made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/226
  • @yoderj made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/230

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.3...v0.5.4

Published by github-actions[bot] almost 2 years ago

py-alpaca-eval - Release v0.5.3

What's Changed

  • [ENH] add mistral-medium by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/205
  • [ENH] add internlm2-chat-20b-ppo by @C1rN09 in https://github.com/tatsu-lab/alpaca_eval/pull/207
  • prettify "pretty_name" of internlm2 by @C1rN09 in https://github.com/tatsu-lab/alpaca_eval/pull/208
  • [ENH] add outputs & configs form dolphin 2.2.1 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/209
  • Add PairRM 0.4B + Yi-34B-Chat to AlpacaEval 2.0 by @jdf-prog in https://github.com/tatsu-lab/alpaca_eval/pull/210
  • dolphin 2.1.1 configs.yaml by @gblazex in https://github.com/tatsu-lab/alpaca_eval/pull/212
  • Update README.md (small typo) by @xwinxu in https://github.com/tatsu-lab/alpaca_eval/pull/213
  • [TEST]: fix ordering of df by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/214
  • Add Snorkel-Mistral-PairRM-DPO (best-of-16) to Alpaca Eval 2.0 by @viethoangtranduong in https://github.com/tatsu-lab/alpaca_eval/pull/215
  • update InternLM2 chat template by @C1rN09 in https://github.com/tatsu-lab/alpaca_eval/pull/216
  • Add Starling-LM-7B-alpha, vicuna-13b-v1.5, vicuna-7b-v1.5 to AlpacaEval (config + outputs without annotations) by @gblazex in https://github.com/tatsu-lab/alpaca_eval/pull/217
  • [RES] add 3 models for arena correlations by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/218
  • Add xwinlm-70b-v0.3 to AlpacaEval by @nbl97 in https://github.com/tatsu-lab/alpaca_eval/pull/221
  • [ENH] add referenced_models locally by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/224

New Contributors

  • @C1rN09 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/207
  • @gblazex made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/212
  • @xwinxu made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/213
  • @viethoangtranduong made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/215

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.2...v0.5.3

Published by github-actions[bot] about 2 years ago

py-alpaca-eval - Release v0.5.2

What's Changed

  • [BUG] force openai >1.5.0 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/202
  • [WIP] precompute all leaderboard for AE2 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/199
  • [ENH] add OpenHermes by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/203

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.1...v0.5.2

Published by github-actions[bot] about 2 years ago

py-alpaca-eval - Release v0.5.1

What's Changed

  • [BUG] fix no OAI org id set by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/200

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.0...v0.5.1

Published by github-actions[bot] about 2 years ago

py-alpaca-eval - Release v0.5.0

What's Changed

  • Fix mssg check by @Muennighoff in https://github.com/tatsu-lab/alpaca_eval/pull/174
  • Add MiniChat-1.5-3B to AlpacaEval and Fix MiniChat-3B by @GeneZC in https://github.com/tatsu-lab/alpaca_eval/pull/176
  • Add 01-ai/Yi-34B-Chat to AlpacaEval by @HyperdriveHustle in https://github.com/tatsu-lab/alpaca_eval/pull/175
  • feat: add way to verify results by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/177
  • show img in readme by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/178
  • Add PairRM best-of-16 to AlpacaEval by @jdf-prog in https://github.com/tatsu-lab/alpaca_eval/pull/181
  • Verify Yi by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/182
  • chore: add phi-2 sft by @lxuechen in https://github.com/tatsu-lab/alpaca_eval/pull/184
  • add cut-13b by @wwxu21 in https://github.com/tatsu-lab/alpaca_eval/pull/186
  • chore: add phi-2 dpo by @lxuechen in https://github.com/tatsu-lab/alpaca_eval/pull/185
  • Support phi2, Support SOLAR 10.7B LMCocktail by @yhyu13 in https://github.com/tatsu-lab/alpaca_eval/pull/183
  • Update openai.py by @Muennighoff in https://github.com/tatsu-lab/alpaca_eval/pull/188
  • chore: add link for phi-2-sft by @lxuechen in https://github.com/tatsu-lab/alpaca_eval/pull/190
  • chore: fix links by @lxuechen in https://github.com/tatsu-lab/alpaca_eval/pull/191
  • Add deita-7b-v1.0 model by @VPeterV in https://github.com/tatsu-lab/alpaca_eval/pull/192
  • [ENH] Azure OAI client & more general way of switching between client configs by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/193
  • [ENH] Weighted win rates by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/189
  • [ENH] new models: Gemini / claude2.1 / mistral / mixtral / .. by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/195
  • [ENH] alpaca_eval 2.0 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/196

New Contributors

  • @Muennighoff made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/174
  • @HyperdriveHustle made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/175
  • @jdf-prog made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/181
  • @lxuechen made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/184
  • @wwxu21 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/186
  • @yhyu13 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/183
  • @VPeterV made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/192

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.6...v0.5.0

Published by github-actions[bot] about 2 years ago

py-alpaca-eval - Release v0.3.6

What's Changed

  • feat: verify all the cohere model & use it as eval by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/170
  • Add Tulu 2 models to AlpacaEval by @hamishivi in https://github.com/tatsu-lab/alpaca_eval/pull/171

New Contributors

  • @hamishivi made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/171

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.5...v0.3.6

Published by github-actions[bot] about 2 years ago

py-alpaca-eval - Release v0.3.5

What's Changed

  • [WIP] GPT4 turbo as evaluator by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/160
  • [ENH] add GPT4 turbo as evaluator in README by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/165
  • Add minichat-3b to AlpacaEval by @GeneZC in https://github.com/tatsu-lab/alpaca_eval/pull/167
  • fix: filter openai spam filter by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/169

New Contributors

  • @GeneZC made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/167

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.3...v0.3.5

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release vv0.3.4

What's Changed

  • [WIP] GPT4 turbo as evaluator by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/160
  • [ENH] add GPT4 turbo as evaluator in README by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/165
  • Add minichat-3b to AlpacaEval by @GeneZC in https://github.com/tatsu-lab/alpaca_eval/pull/167
  • fix: filter openai spam filter by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/169

New Contributors

  • @GeneZC made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/167

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.3...vv0.3.4

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.3.3

What's Changed

  • Gpt4 turbo by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/159

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.2...v0.3.3

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.3.2

What's Changed

  • add UltraLM-13b-V2.0/UltraLM-13b-V2.0-best-of-16/UltraLM-13b-best-of-16 to AlpacaEval by @lifan-yuan in https://github.com/tatsu-lab/alpaca_eval/pull/139
  • Add annotations & fix leaderboard by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/142
  • refresh Cohere by @sanderland in https://github.com/tatsu-lab/alpaca_eval/pull/141
  • Add PlatoLM-7B to AlpacaEval by @renatz in https://github.com/tatsu-lab/alpaca_eval/pull/143
  • Add evo-7b to AlpacaEval by @zfang in https://github.com/tatsu-lab/alpaca_eval/pull/144
  • Add NEFTune models to AlpacaEval by @neelsjain in https://github.com/tatsu-lab/alpaca_eval/pull/146
  • Add claude2-alpaca-13b, recycled-wizardlm-7b-v1.0, recycled-wizardlm-… by @MingLiiii in https://github.com/tatsu-lab/alpaca_eval/pull/147
  • Add CausalLM/14B to AlpacaEval by @CausalLM in https://github.com/tatsu-lab/alpaca_eval/pull/148
  • Add Zephyr 7B evals by @lewtun in https://github.com/tatsu-lab/alpaca_eval/pull/152
  • Add Evo v2 7B by @zfang in https://github.com/tatsu-lab/alpaca_eval/pull/153
  • Add decoder for calling Anthropic models via Amazon Bedrock by @billcai in https://github.com/tatsu-lab/alpaca_eval/pull/151
  • cohere update by @sanderland in https://github.com/tatsu-lab/alpaca_eval/pull/155
  • feat: upgrade to openai 1.0.0 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/157

New Contributors

  • @lifan-yuan made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/139
  • @renatz made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/143
  • @zfang made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/144
  • @neelsjain made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/146
  • @MingLiiii made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/147
  • @CausalLM made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/148
  • @lewtun made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/152
  • @billcai made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/151

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.1...v0.3.2

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.3.1

What's Changed

  • Add results of Xwin-LM by @nbl97 in https://github.com/tatsu-lab/alpaca_eval/pull/135
  • [ENH] add gpt 3.5 instruct by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/137

New Contributors

  • @nbl97 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/135

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.0...v0.3.1

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.3.0

What's Changed

  • [ENH] add fixed gpt4 version annotator by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/127
  • Add openbuddy-llama2-13b-v11.1 by @44670 in https://github.com/tatsu-lab/alpaca_eval/pull/129
  • [ENH] add max concurrency oai by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/131

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.9...v0.3.0

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.2.9

What's Changed

  • Ensure primary keys are string & decrease processes for OpenAI by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/116
  • Add JinaChat to the leaderboards by @jupyterjazz in https://github.com/tatsu-lab/alpaca_eval/pull/117
  • [BUG] jina chat error in configs by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/118
  • Add Humpback to AlpacaEval by @xianxl in https://github.com/tatsu-lab/alpaca_eval/pull/120
  • update Humpback results by @xianxl in https://github.com/tatsu-lab/alpaca_eval/pull/121
  • add link to Humpback paper by @xianxl in https://github.com/tatsu-lab/alpaca_eval/pull/122
  • Add vllm decoder for model inference by @44670 in https://github.com/tatsu-lab/alpaca_eval/pull/124
  • [ENH] return completions_all and allow sequence of max_tokens by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/125

New Contributors

  • @jupyterjazz made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/117
  • @xianxl made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/120

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.8...v0.2.9

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.2.8

What's Changed

  • [BUG] closes #77 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/109
  • Add openbuddy-llama-30b-v7.1 to AlpacaEval by @44670 in https://github.com/tatsu-lab/alpaca_eval/pull/108
  • Fix typo on pretty_name by @44670 in https://github.com/tatsu-lab/alpaca_eval/pull/110
  • Add openbuddy-falcon-40b-v9 to AlpacaEval by @44670 in https://github.com/tatsu-lab/alpaca_eval/pull/111
  • [CLEAN] remove warning by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/112
  • [BUG] utils.DUMMY_EXAMPLE by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/113

New Contributors

  • @44670 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/108

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.7...v0.2.8

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.2.7

What's Changed

  • Update WizardLM 13B V1.2 results by @victorsungo in https://github.com/tatsu-lab/alpaca_eval/pull/99
  • [ENH] llama70B and chunking by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/100
  • [ENH] add pipeline meta parser by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/103
  • [CLEAN] Single annotator not abstract by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/104
  • Add OpenChat 3.1 Results by @imoneoi in https://github.com/tatsu-lab/alpaca_eval/pull/105
  • [ENH] add example with HF API by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/106

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.6...v0.2.7

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.2.6

What's Changed

  • [STYLE] fix ill-formatted logging message by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/97
  • [STYLE] PR medium eval (ANNOTATOR_COLUMN) by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/98

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.5...v0.2.6

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.2.5

What's Changed

  • [ENH] adds processors by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/95

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.4...v0.2.5

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.2.4

What's Changed

  • Add Baichuan-13B-Chat Results by @inferLLM in https://github.com/tatsu-lab/alpaca_eval/pull/85
  • Add ChatGLM2-6B Results by @inferLLM in https://github.com/tatsu-lab/alpaca_eval/pull/86
  • [ENH] add chat llama2 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/87
  • [ENH] automatically add minimal/verified by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/88
  • [ENH] add replicate + llama 70B by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/90
  • [ENH] add llama 70B outputs by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/91
  • [ENH] optionally return raw completions by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/92
  • [ENH] eval_parser by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/93
  • [ENH] json parser by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/94

New Contributors

  • @inferLLM made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/85

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.3...v0.2.4

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.2.3

What's Changed

  • [ENH] make completion_parser easier to inherit by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/81
  • [ENH] Add length by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/79
  • [ENH] add format_sample_sheets.py to CI by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/82
  • [ENH] adding samples to leaderboard by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/83

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.2...v0.2.3

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.2.2

What's Changed

  • [ENH] add base annotator by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/76
  • [ENH] add claude v2 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/78

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.1...v0.2.2

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.2.1

What's Changed

  • Update WizardLM 13B V1.1 results by @victorsungo in https://github.com/tatsu-lab/alpaca_eval/pull/66
  • [ENH] make it easier to cache to a DB by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/73
  • add vicuna v1.3 results by @rtaori in https://github.com/tatsu-lab/alpaca_eval/pull/74
  • gpt4 annotations for vicuna v1.3 by @rtaori in https://github.com/tatsu-lab/alpaca_eval/pull/75

New Contributors

  • @victorsungo made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/66

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.0...v0.2.1

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - Release v0.2.0

What's Changed

  • [CI] auto release by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/72

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.1.9...v0.2.0

Published by github-actions[bot] over 2 years ago

py-alpaca-eval - v0.1.7

What's Changed

  • Add Custom OpenAI API Endpoint Support and OpenChat Results by @imoneoi in https://github.com/tatsu-lab/alpaca_eval/pull/42
  • get falcon models running decoding by @rtaori in https://github.com/tatsu-lab/alpaca_eval/pull/47
  • [TEST] test by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/50
  • [ENH] upgrade anthropic 0.3 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/54
  • [CLEAN] black by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/55
  • [TEST] setting up test CI by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/56
  • Add Baize v2 13B by @JetRunner in https://github.com/tatsu-lab/alpaca_eval/pull/49
  • [CI] leaderboard formatting by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/58
  • format leaderboard for baize by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/59
  • [ENH] remove inputs from example by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/60
  • [CLEAN] setting up precommit by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/61

New Contributors

  • @imoneoi made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/42
  • @JetRunner made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/49

Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.1.6...v0.1.7.1

Published by YannDubs over 2 years ago

py-alpaca-eval - v0.1.6

Published by rtaori over 2 years ago

py-alpaca-eval - v0.1.5

Add more accessible chatgpt_fn annotator

Published by rtaori over 2 years ago

py-alpaca-eval - v0.1.3

Update requirements to python 3.10

Published by rtaori over 2 years ago

py-alpaca-eval - v0.1.1

Published by lxuechen over 2 years ago

py-alpaca-eval - v0.1.0

Published by lxuechen over 2 years ago