Releases | Open Source Science

lazyllm-llamafactory - v0.9.3: Llama4, Gemma3, Qwen3, InternVL3, Qwen2.5-Omni

We will attend the AWS Summit Shanghai 2025 on June 20th! See you in Shanghai 👋

Event info: https://aws.amazon.com/cn/events/summits/shanghai/

New features

🔥 InternVL2.5/InternVL3 model by @Kuangdd01 in #7258
🔥 Qwen2.5-Omni model by @Kuangdd01 in #7537
🔥 Llama 4 and Gemma 3 multimodal model by @hiyouga in #7273 and #7611
🔥 Official GPU docker image by @yzoaim in #8181
🔥 SGLang inference by @Qiaolin-Yu and @jhinpan in #7278
GLM-4-0414 and GLM-Z1 model by @zRzRzRzRzRzRzR in #7695
Kimi-VL model by @Kuangdd01 in #7719
Qwen3 model by @hiyouga in #7885
MiMo and MiMo-VL model by @Kuangdd01 in #7946 #8249
SmolLM/SmolLM2 model by @akshatsehgal in #8050 #8220
MiniCPM4 model by @LDLINGLINGLING in #8314
Mistral-Small-3.1 model by @Kuangdd01 in #8335
Add scripts/eval_bleu_rouge.py by @SnowFox4004 in #7419
Add Muon optimizer by @tianshijing in #7749
Support video/audio inference with vLLM by @hiyouga in #7566
Support S3/GCS cloud data by @erictang000 in #7567
Support vLLM-ascend by @leo-pony in #7739
Support OmegaConf by @hiyouga in #7793
Support early-stopping by @hiyouga in #7797
Add enable_thinking argument for reasoning models by @hiyouga in #7928
PyTorch-elastic and fault-tolerant launch by @hubutui in #8286
Length Desensitization DPO (LD-DPO) by @amangup in #8362

New models

Base models
- SmolLM/SmolLM2 (135M/360M/1.7B) 📄
- Qwen3 Base (0.6B/1.7B/4B/8B/14B/30B) 📄
- Gemma 3 (1B/4B/12B/27B) 📄🖼️
- MedGemma (4B) 📄🩺
- MiMo Base (7B) 📄
- Seed-Coder Base (8B) 📄⌨️
- Mistral-Small-3.1 Base (24B) 📄🖼️
- GLM-4-0414 Base (32B) 📄
- Llama 4 (109B/402B) 📄🖼️
Instruct/Chat models
- SmolLM/SmolLM2 Instruct (135M/360M/1.7B) 📄🤖
- MiniCPM4 (0.5B/8B) 📄🤖
- Qwen3 (0.6B/1.7B/4B/8B/14B/32B/30B/235B) 📄🤖🧠
- Gemma 3 Instruct (1B/4B/12B/27B) 📄🤖🖼️
- InternVL2.5/3 Instruct/MPO (1B/2B/8B/14B/38B/78B) 📄🤖🖼️
- Qwen2.5-Omni (3B/7B) 📄🤖🖼️🔈
- MedGemma Instruct (4B/27B) 📄🤖🩺
- MiMo SFT/RL (7B) 📄🤖
- MiMo-VL SFT/RL (7B) 📄🤖🖼️
- Hunyuan Instruct (7B) 📄🤖
- Seed-Coder Instruct/Reasoning (8B) 📄🤖🧠⌨️
- GLM-4-0414/GLM-Z1 Instruct (9B/32B) 📄🤖🧠
- DeepSeek-R1-0528 (8B/671B) 📄🤖🧠
- Kimi-VL Instruct/Thinking (17B) 📄🤖🧠🖼️
- Mistral-Small-3.1 Instruct (24B) 📄🤖🖼️
- Qwen2.5-VL Instruct (32B) 📄🤖🖼️
- Llama 4 Instruct (109B/402B) 📄🤖🖼️

New datasets

Preference datasets
- COIG-P (zh) 📄

Bug fix

Fix add new tokens by @flashJd in #7253
Fix ultrachat_200k dataset by @felladrin in #7259
Add efficient 4D attention mask for neat packing by @BlackWingedKing in #7272
Fix WSD lr scheduler by @x22x22 in #7304
Fix position ids in neat packing by @BlackWingedKing in #7318
Fix proxy setting in webui by @taoharry in #7332
Improve entrypoint by @ENg-122 in #7345
Fix ray destroy process group by @erictang000 in #7395
Fix SGLang dependencies by @guoquan in #7432
Upgrade docker package version by @rumichi2210 in #7442
Update liger kernel for qwen2.5-vl by @xiaosu-zhu in #7453
Fix lora on quant models by @GuoCoder in #7456
Enable liger kernel for gemma3 by @kennylam777 in #7462
Enable liger kernel for paligemma by @eljandoubi in #7466
Add Swanlab lark notification by @Xu-pixel in #7481
Fix gemma3 use cache attribute by @ysjprojects in #7500
Fix pixtral plugin by @Kuangdd01 in #7505
Fix KTO mismatch pair strategy by @himalalps in #7509
Support dataset_shards by @aliencaocao in #7530
Fix qwen2.5omni plugin by @Kuangdd01 in #7573 #7578 #7883
Fix ppo trainer by @gechengze in #7576
Fix workflow by @Shawn-Tao in #7635
Support qwen2.5omni audio+video2text by @Kuangdd01 in #7638
Upgrade deps for SGLang by @adarshxs in #7639
Allow ray env setting by @erictang000 in #7647
Fix CUDA warning on intel xpus by @jilongW in #7655
Fix liger kernel patch by @danny980521 in #7660
Fix rocm dockerfile by @fluidnumerics-joe in #7725
Fix qwen2vl with neat packing by @GeoffreyChen777 in #7754
Fix a constant by @AlphaBladez in #7765
Fix autogptq for Gemma by @ddddng in #7786
Fix internvl models by @Kuangdd01 in #7801 #7803 #7817 #8129
Fix DeepSpeed ZeRO3 on moe models by @hiyouga in #7826 #7879
Fix gradient checkpoint func for vit by @hiyouga in #7830
Support S3 ray storage by @erictang000 in #7854
Fix Kimi-VL attention by @Kuangdd01 in #7867
Fix minicpm-o vllm inference by @hiyouga in #7870
Unfreeze multimodal projector in freeze training by @zhaop-l in #7872
Fix Qwen2.5-omni plugin by @hiyouga in #7875 #7962
Add warp support link by @ericdachen in #7887
Replace eos token for base model by @hiyouga in #7911
Add eval_on_each_dataset arg by @hiyouga in #7912
Fix qwen3 loss by @hiyouga in #7923 #8109
Add repetition_penalty to api by @wangzhanxd in #7958
Add graphgen to readme by @tpoisonooo in #7974
Support video params in vllm batch infer by @Kuangdd01 in #7992
Fix tool formatter by @yunhao-tech in #8000
Fix kimi vl plugin by @hiyouga in #8015
Support batch preprocess in vllm batch infer by @Shawn-Tao in #8051
Support loading remote folder by @erictang000 in #8078
Fix video utils import by @Kuangdd01 in #8077
Fix SGLang LoRA inference by @Kiko-RWan in #8067
Fix cli by @Wangbiao2 in #8095
Fix pretrain workflow by @SunnyHaze in #8099
Fix rope args for yarn by @piamo in #8101
Add no build isolation in installing by @hiyouga in #8103
Switch to GPTQModel and deprecate AutoGPTQ by @hiyouga in #8108
Support llama3 parallel function call by @hiyouga in #8124
Add data_shared_file_system by @hiyouga in #8179
Fix load remote files by @youngwookim in #8183
Fix dataset info by @Muqi1029 in #8197
Fix qwen2.5 omni merge script by @Kuangdd01 in #8227 #8293
Add unittest for VLM save load by @Kuangdd01 in #8248
Add tag in swanlab by @Zeyi-Lin in #8258
Support input video frames by @Kuangdd01 in #8264
Fix empty template by @hiyouga in #8312
Support full-finetuning with unsloth by @Remorax in #8325
Add awesome work by @MING-ZCH in #8333
Release v0.9.3 by @hiyouga in #8386
Fix qwen2vl position ids by @hiyouga in #8387
Fix vlm utils by @hiyouga in #8388
Fix #3802 #4443 #5548 #6236 #6322 #6432 #6708 #6739 #6881 #6919 #7080 #7105 #7119 #7225 #7267 #7327 #7389 #7416 #7427 #7428 #7443 #7447 #7454 #7490 #7501 #7502 #7513 #7520 #7541 #7545 #7552 #7563 #7598 #7600 #7613 #7636 #7678 #7680 #7687 #7688 #7730 #7743 #7772 #7791 #7800 #7816 #7829 #7845 #7865 #7874 #7889 #7905 #7906 #7907 #7909 #7916 #7918 #7919 #7939 #7953 #7965 #7990 #8008 #8056 #8061 #8066 #8069 #8087 #8091 #8092 #8096 #8097 #8111 #8119 #8147 #8166 #8169 #8174 #8182 #8189 #8223 #8241 #8247 #8253 #8294 #8309 #8324 #8326 #8332

Full Changelog: https://github.com/hiyouga/LLaMA-Factory/compare/v0.9.2...v0.9.3

- Python
Published by hiyouga 12 months ago

lazyllm-llamafactory - v0.9.2: MiniCPM-o, SwanLab, APOLLO

This is the last version before LLaMA-Factory v1.0.0. We are working hard to improve the efficiency and availability.

We will attend the vLLM Beijing Meetup on Mar 16th! See you in Beijing 👋

Event info: https://mp.weixin.qq.com/s/viPRDlhnzS3qO9-96fMeeA

New features

🔥 APOLLO optimizer by @zhuhanqing in #6617
🔥 SwanLab experiment tracker by @Zeyi-Lin in #6401
🔥 Ray Trainer by @erictang000 in #6542
Batch inference with vLLM TP by @JieShenAI in #6190
QLoRA on Ascend NPU by @codemayq in #6601
Yarn and Llama3 rope scaling by @hiyouga in #6693
Support uv run by @erictang000 in #6907
Ollama modelfile auto-generation by @codemayq in #4686
Mistral tool prompt by @AlongWY in #5473
Llama3 and Qwen2 tool prompt by @hiyouga in #6367 and #6369

New models

Base models
- GPT2 (0.1B/0.4B/0.8B/1.5B) 📄
- Granite 3.0-3.1 (1B/2B/3B/8B) 📄
- PaliGemma2 (3B/10B/28B) 📄🖼️
- Moonlight (16B) 📄
- DeepSeek V2-V2.5 Base (236B) 📄
- DeepSeek V3 Base (671B) 📄
Instruct/Chat models
- Granite 3.0-3.1 (1B/2B/3B/8B) by @Tuyohai in #5922 📄🤖
- DeepSeek R1 (1.5B/7B/8B/14B/32B/70B/671B) by @Qwtdgh in #6767 📄🤖
- TeleChat2 (3B/7B/12B/35B/115B) by @ge-xing in #6313 📄🤖
- Qwen2.5-VL (3B/7B/72B) by @hiyouga in #6779 📄🤖🖼️
- PaliGemma2-mix (3B/10B/28B) by @Kuangdd01 in #7060 📄🤖🖼️
- Qwen2 Audio (7B) by @BUAADreamer in #6701 📄🤖🔈
- MiniCPM-V/MiniCPM-o (8B) by @BUAADreamer in #6598 and #6631 📄🤖🖼️🔈
- InternLM3-Instruct (8B) by @hhaAndroid in #6640 📄🤖
- Marco-o1 (8B) 📄🤖
- Skywork-o1 (8B) 📄🤖
- Phi-4 (14B) 📄🤖
- Moonlight Instruct (16B) 📄
- Mistral Small (24B) 📄🤖
- QwQ (32B) 📄🤖
- Llama-3.3-Instruct (70B) 📄🤖
- QvQ (72B) 📄🤖🖼️
- DeepSeek V2-V2.5 (236B) 📄🤖
- DeepSeek V3 (671B) 📄🤖

New datasets

Supervised fine-tuning datasets
- OpenO1 (en) 📄
- Open Thoughts (en) 📄
- Open-R1-Math (en) 📄
- Chinese-DeepSeek-R1-Distill (zh) 📄

Changes

Refactor VLMs register by @hiyouga in #6600
Refactor mm plugin by @hiyouga in #6895
Refactor template by @hiyouga in #6896
Refactor data pipeline by @hiyouga in #6901
Update vlm arguments by @hiyouga in #6976
We have cleaned large files from the git history using BFG Repo-Cleaner; a backup repo is available.

Bug fix

Add trust_remote_code option by @yafshar in #5819
Fix mllama config by @hiyouga in #6137 and #6140
Fix mllama pad by @hiyouga in #6151 and #6874
Pin tokenizers version by @hiyouga in #6157
Fix tokenized data loading by @village-way in #6160
Show hostname in webui by @hykilpikonna in #6170
Fix VLMs zero3 training by @hiyouga in #6233
Add skip_special_tokens by @hiyouga in #6363
Support non-reentrant-gc by @hiyouga in #6364
Add disable_shuffling option by @hiyouga in #6388
Fix gen kwargs by @hiyouga in #6395
Enable module run by @youkaichao in #6457
Fix eval loss value by @hiyouga in #6465
Fix paligemma inference by @hiyouga in #6483
Add deepseek v3 template by @piamo in #5507
Add http proxy argument in dockerfile by @shibingli in #6462
Fix trainer generate by @hiyouga in #6512
Fix pixtral DPO training by @hiyouga in #6547
Fix ray args by @stephen-nju in #6564
Fix minicpm template by @BUAADreamer in #6620
Fix stop tokens for visual detection by @hiyouga in #6624
Pin vllm version by @hiyouga in #6629
Fix mllama any image by @hiyouga in #6637 and #7053
Fix tokenizer max length by @xiaosu-zhu in #6632
Fix webui locale by @steveepreston in #6653
Fix MiniCPM-o DPO training by @BUAADreamer in #6657
Fix Qwen2 MoE training by @hiyouga in #6684
Upgrade to gradio 5 by @hiyouga in #6688
Support Japanese local file by @engchina in #6698
Fix DPO loss by @yinpu in #6722
Webui thinking mode by @hiyouga in #6778
Upgrade to transformers 4.48 by @hiyouga in #6628
Fix ci by @hiyouga in #6787
Fix instructions about installing fa2 on win platform in readme by @neavo in #6788
Fix minicpmv plugin by @BUAADreamer in #6801, #6890, #6946 and #6998
Fix qwen2 tool prompt by @yueqis in #6796
Fix llama pro by @hiyouga in #6814
Allow thought in function call by @yueqis in #6797
Add ALLOW_EXTRA_ARGS by @hiyouga in #6831
Fix Qwen2vl plugin by @hiyouga in #6855
Upgrade vllm to 0.7.2 by @hiyouga in #6857
Fix unit test for tool using by @hiyouga in #6865
Skip broken data in sharegpt converter by @JJJYmmm in #6879
Fix qwen2.5 plugin for video by @JJJYmmm in #6868
Parsing chat template from tokenizer by @hiyouga in #6905 (experimental)
Fix mllama KTO training by @marko1616 in #6904
Fix grad checkpointing by @hiyouga in #6916 and #6931
Fix ollama template by @hiyouga in #6902
Fix ray example by @erictang000 in #6906
Improve error handling for media by @noahc1510 in #6128
Support split on each dataset by @SrWYG in #5522
Fix gen kwargs in training by @aliencaocao in #5451
Liger kernel for qwen2.5vl by @hiyouga in #6930
Fix lora target modules by @hiyouga in #6944
Add ray_storage_path by @erictang000 in #6920
Fix trainer.predict by @hiyouga in #6972
Add min resolution control by @hiyouga in #6975
Upgrade transformers to 4.49 by @hiyouga in #6982
Add seed in vllm batch predict by @JieShenAI in #7058
Fix pyproject.toml by @hiyouga in #7067
Upgrade CANN images by @leo-pony in #7061
Display swanlab link by @Zeyi-Lin in #7089
Fix hf engine by @hiyouga in #7120
Add bailing chat template by @oldstree in #7117
Use bicubic resampler instead of nearest by @hiyouga in #7143
Fix Qwen2Audio plugin by @lsrami in #7166
Destroy process group by @hiyouga in #7174
Fix swanlab callback by @Zeyi-Lin in #7176
Fix paligemma plugin by @hiyouga in #7181
Escape html tag in webui by @hiyouga in #7190
Upgrade vllm to 0.7.3 by @hiyouga in #7183 and #7193
Fix parser by @hiyouga in #7204
Fix function formatter by @zhangch-ss in #7201
Fix deepspeed config by @hiyouga in #7205
Fix dataloader by @hiyouga in #7207
Fix export tokenizer by @hiyouga in #7230
Update arguments by @hiyouga in #7231
Add swanlab_logdir by @Zeyi-Lin in #7219
Fix vllm batch prediction by @hiyouga in #7235
Avoid exit after saving tokenized data by @hiyouga in #7244
Support commit in env by @hiyouga in #7247
Release v0.9.2 by @hiyouga in #7242
Fix #1204 #3306 #3462 #5121 #5270 #5404 #5444 #5472 #5518 #5616 #5712 #5714 #5756 #5944 #5986 #6020 #6056 #6092 #6136 #6139 #6149 #6165 #6213 #6287 #6320 #6345 #6346 #6348 #6358 #6362 #6391 #6415 #6439 #6448 #6452 #6482 #6499 #6543 #6546 #6551 #6552 #6610 #6612 #6636 #6639 #6662 #6669 #6738 #6772 #6776 #6780 #6782 #6793 #6806 #6812 #6819 #6826 #6833 #6839 #6850 #6854 #6860 #6878 #6885 #6889 #6937 #6948 #6952 #6960 #6966 #6973 #6981 #7036 #7064 #7072 #7116 #7125 #7130 #7171 #7173 #7180 #7182 #7184 #7192 #7198 #7213 #7234 #7243

Full Changelog: https://github.com/hiyouga/LLaMA-Factory/compare/v0.9.1...v0.9.2

- Python
Published by hiyouga about 1 year ago

lazyllm-llamafactory - v0.9.1: Many Vision Models, Qwen2.5 Coder, Gradient Fix

New features

🔥Support Llama-3.2 and Llama-3.2-Vision by @marko1616 in #5547 and #5555
🔥Support LLaVA-NeXT, LLaVA-NeXT-Video and Video-LLaVA by @BUAADreamer in #5574
🔥Support Pixtral model by @Kuangdd01 in #5581
Support EXAONE3.0 by @shing100 in #5585
Support Index-series models by @Cuiyn in #5910
Support Liger-Kernel for Qwen2-VL by @aliencaocao in #5438
Support downloading models from ModelHub by @huniu20 in #5642
Fix abnormal loss values in transformers 4.46 by @hiyouga in #5852 #5871
Support multi-image inference by @hiyouga in #5895
Support calculating effective tokens for SFT and DPO by @wtmlon in #6078

Note: install transformers>=4.46.0,<=4.46.1 to enable the gradient accumulation fix.
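The version range in the note can be checked before launching a run. The following is a minimal sketch, not part of LLaMA-Factory: the helper names release_tuple and has_grad_accum_fix are hypothetical, and the parser only looks at the first three numeric components, so a development build such as 4.46.0.dev0 is treated like 4.46.0. In a real environment the version string would typically come from transformers.__version__.

```python
def release_tuple(version: str) -> tuple:
    """Return the first three numeric components, e.g. '4.46.1' -> (4, 46, 1)."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1' or 'dev0'
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)  # pad short versions such as '4.46'
    return tuple(parts)


def has_grad_accum_fix(version: str) -> bool:
    """True when the version lies in the range named in the note (hypothetical helper)."""
    return (4, 46, 0) <= release_tuple(version) <= (4, 46, 1)


if __name__ == "__main__":
    for candidate in ("4.45.2", "4.46.0", "4.46.1", "4.47.0"):
        print(candidate, has_grad_accum_fix(candidate))
```

Tuple comparison keeps the sketch dependency-free; a production check would more likely rely on a dedicated version-parsing library.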

New models

Base models
- Qwen2.5 (0.5B/1.5B/3B/7B/14B/32B/72B) 📄
- Qwen2.5-Coder (0.5B/1.5B/3B/7B/14B/32B) 📄🖥️
- Llama-3.2 (1B/3B) 📄
- OpenCoder (1.5B/8B) 📄🖥️
- Index (1.9B) 📄
Instruct/Chat models
- Qwen2.5-Instruct (0.5B/1.5B/3B/7B/14B/32B/72B) 📄🤖
- Qwen2.5-Coder-Instruct (0.5B/1.5B/3B/7B/14B/32B) 📄🤖🖥️
- Llama-3.2-Instruct (1B/3B) 📄🤖
- OpenCoder-Instruct (1.5B/8B) 📄🤖🖥️
- Index-Chat (1.9B) 📄🤖
- LLaVA-NeXT (7B/8B/13B/34B/72B/110B) 📄🤖🖼️
- LLaVA-NeXT-Video (7B/34B) 📄🤖🖼️
- Video-LLaVA (7B) 📄🤖🖼️
- Pixtral (12B) 📄🤖🖼️
- EXAONE-3.0-Instruct (8B) 📄🤖

Security fix

Fix CVE-2024-52803 by @superboy-zjc in aa6a174d6822340022433c5ba38182b4932adecb

Bug fix

Update version of rocm docker by @HardAndHeavy in #5427
Fix Phi-3-small template by @menibrief in #5475
Fix function call dataset process function by @whybeyoung in #5483
Add docker args by @StrangeBytesDev in #5533
Fix logger by @chengchengpei in #5546
Fix Gemma2 flash attention warning by @amrear in #5580
Update setup by @johnnynunez in #5615 #5665
Add project by @NLPJCL in #5801
Fix saving Qwen2-VL processor by @hiyouga in #5857
Support change base image in dockerfile by @sd3ntato in #5880
Fix template replace behaviour by @hiyouga in #5907
Add image_dir argument by @hiyouga in #5909
Add rank0 logger by @hiyouga in #5912
Fix DPO metrics by @hiyouga in #5913 #6052
Update datasets version by @hiyouga in #5926
Fix chat engines by @hiyouga in #5927
Fix vllm 0.6.3 by @hiyouga in #5970
Fix extra args in llamaboard by @hiyouga in #5971
Fix vllm input args by @JJJJerry in #5973
Add vllm_config args by @hiyouga in #5982 #5990
Add shm_size in docker compose config by @XYZliang in #6010
Fix tyro version by @hiyouga in #6065
Fix ci by @hiyouga in #6120
Fix Qwen2-VL inference on vLLM by @hiyouga in #6123 #6126
Release v0.9.1 by @hiyouga in #6124
Fix #3881 #4712 #5411 #5542 #5549 #5611 #5668 #5705 #5747 #5749 #5768 #5796 #5797 #5883 #5904 #5966 #5988 #6050 #6061

Full Changelog: https://github.com/hiyouga/LLaMA-Factory/compare/v0.9.0...v0.9.1

- Python
Published by hiyouga over 1 year ago

lazyllm-llamafactory - v0.9.0: Qwen2-VL, Liger-Kernel, Adam-mini

Congratulations on 30,000 stars 🎉 Follow us on X (Twitter)

New features

🔥Support fine-tuning Qwen2-VL model on multi-image datasets by @simonJJJ in #5290
🔥Support time&memory-efficient Liger-Kernel via the enable_liger_kernel argument by @hiyouga
🔥Support memory-efficient Adam-mini optimizer via the use_adam_mini argument by @relic-yuexi in #5095
Support fine-tuning Qwen2-VL model on video datasets by @hiyouga in #5365 and @BUAADreamer in #4136 (needs patch https://github.com/huggingface/transformers/pull/33307)
Support fine-tuning vision language models (VLMs) using RLHF/DPO/ORPO/SimPO approaches by @hiyouga
Support Unsloth's asynchronous activation offloading method via the use_unsloth_gc argument
Support vLLM 0.6.0 version
Support MFU calculation by @yzoaim in #5388

New models

Base models
- Qwen2-Math (1.5B/7B/72B) 📄🔢
- Yi-Coder (1.5B/9B) 📄
- InternLM2.5 (1.8B/7B/20B) 📄
- Gemma-2-2B 📄
- Meta-Llama-3.1 (8B/70B) 📄
Instruct/Chat models
- MiniCPM/MiniCPM3 (1B/2B/4B) by @LDLINGLINGLING in #4996 #5372 📄🤖
- Qwen2-Math-Instruct (1.5B/7B/72B) 📄🤖🔢
- Yi-Coder-Chat (1.5B/9B) 📄🤖
- InternLM2.5-Chat (1.8B/7B/20B) 📄🤖
- Qwen2-VL-Instruct (2B/7B) 📄🤖🖼️
- Gemma-2-2B-it by @codemayq in #5037 📄🤖
- Meta-Llama-3.1-Instruct (8B/70B) 📄🤖
- Mistral-Nemo-Instruct (12B) 📄🤖

New datasets

Supervised fine-tuning datasets
- Magpie-ultra-v0.1 (en) 📄
- Pokemon-gpt4o-captions (en&zh) 📄🖼️
Preference datasets
- RLHF-V (en) 📄🖼️
- VLFeedback (en) 📄🖼️

Changes

For compatibility reasons, fine-tuning vision language models (VLMs) requires transformers>=4.45.0.dev0. Try pip install git+https://github.com/huggingface/transformers.git to install it.
visual_inputs has been deprecated, now you do not need to specify this argument.
LlamaFactory now adopts lazy loading for multimodal inputs, see #5346 for details. Please use preprocessing_batch_size to restrict the batch size in dataset pre-processing (supported by @naem1023 in #5323).
LlamaFactory now supports lmf (equivalent to llamafactory-cli) as a shortcut command.

Bug fix

Fix LlamaBoard export by @liuwwang in #4950
Add ROCm dockerfiles by @HardAndHeavy in #4970
Fix deepseek template by @piamo in #4892
Fix pissa savecallback by @codemayq in #4995
Add Korean display language in LlamaBoard by @Eruly in #5010
Fix deepseekcoder template by @relic-yuexi in #5072
Fix examples by @codemayq in #5109
Fix mask_history truncate from last by @YeQiuO in #5115
Fix jinja template by @YeQiuO in #5156
Fix PPO optimizer and lr scheduler by @liu-zichen in #5163
Add SailorLLM template by @chenhuiyu in #5185
Fix XPU device count by @Zxilly in #5188
Fix bf16 check in NPU by @Ricardo-L-C in #5193
Update NPU docker image by @MengqingCao in #5230
Fix image input api by @marko1616 in #5237
Add liger-kernel link by @ByronHsu in #5317
Fix #4684 #4696 #4917 #4925 #4928 #4944 #4959 #4992 #5035 #5048 #5060 #5092 #5228 #5252 #5292 #5295 #5305 #5307 #5308 #5324 #5331 #5334 #5338 #5344 #5366 #5384

- Python
Published by hiyouga over 1 year ago

lazyllm-llamafactory - v0.8.3: Neat Packing, Split Evaluation

New features

🔥Support contamination-free packing via the neat_packing argument by @chuan298 in #4224
🔥Support split evaluation via the eval_dataset argument by @codemayq in #4691
🔥Support HQQ/EETQ quantization via the quantization_method argument by @hiyouga
🔥Support ZeRO-3 when using BAdam by @Ledzy in #4352
Support train on the last turn via the mask_history argument by @aofengdaxia in #4878
Add NPU Dockerfile by @MengqingCao in #4355
Support building FlashAttention2 in Dockerfile by @hzhaoy in #4461
Support batch_eval_metrics at evaluation by @hiyouga
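Contamination-free packing means that tokens from one example cannot attend to tokens from another example packed into the same row. The sketch below illustrates the idea with a block-diagonal causal mask built from plain Python lists. It is an illustration only, not the project's implementation, and the function name packed_causal_mask is hypothetical.

```python
def packed_causal_mask(lengths):
    """Build a causal attention mask for sequences packed back to back.

    mask[i][j] is 1 when query position i may attend to key position j:
    j must not lie in the future (j <= i) and must belong to the same sequence.
    """
    # segment_ids[k] records which original sequence position k came from.
    segment_ids = []
    for seq_index, length in enumerate(lengths):
        segment_ids.extend([seq_index] * length)

    total = len(segment_ids)
    return [
        [1 if j <= i and segment_ids[i] == segment_ids[j] else 0 for j in range(total)]
        for i in range(total)
    ]


if __name__ == "__main__":
    # Two examples of length 2 and 3 packed into one row of 5 tokens.
    for row in packed_causal_mask([2, 3]):
        print(row)
```

Each block on the diagonal is an ordinary lower-triangular causal mask, and everything outside the blocks is zero, which is what keeps packed examples from leaking into each other.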

New models

Base models
- InternLM2.5-7B 📄
- Gemma2 (9B/27B) 📄
Instruct/Chat models
- TeleChat-1B-Chat by @hzhaoy in #4651 📄🤖
- InternLM2.5-7B-Chat 📄🤖
- CodeGeeX4-9B-Chat 📄🤖
- Gemma2-it (9B/27B) 📄🤖

Changes

Fix DPO cutoff len and deprecate reserved_label_len argument
Improve loss function for reward modeling

Bug fix

Fix numpy version by @MengqingCao in #4382
Improve cli by @kno10 in #4409
Add tool_format parameter to control prompt by @mMrBun in #4417
Automatically label npu issue by @MengqingCao in #4445
Fix flash_attn args by @stceum in #4446
Fix docker-compose path by @MengqingCao in #4544
Fix torch-npu dependency by @hashstone in #4561
Fix deepspeed + pissa by @hzhaoy in #4580
Improve cli by @injet-zhou in #4590
Add project by @wzh1994 in #4662
Fix docstring by @hzhaoy in #4673
Fix Windows command preview in WebUI by @marko1616 in #4700
Fix vllm 0.5.1 by @T-Atlas in #4706
Fix save value head model callback by @yzoaim in #4746
Fix CUDA Dockerfile by @hzhaoy in #4781
Fix examples by @codemayq in #4804
Fix evaluation data split by @codemayq in #4821
Fix CI by @codemayq in #4822
Fix #2290 #3974 #4113 #4379 #4398 #4402 #4410 #4419 #4432 #4456 #4458 #4549 #4556 #4579 #4592 #4609 #4617 #4674 #4677 #4683 #4684 #4699 #4705 #4731 #4742 #4779 #4780 #4786 #4792 #4820 #4826

- Python
Published by hiyouga almost 2 years ago

lazyllm-llamafactory - v0.8.2: PiSSA, Parallel Functions

New features

Support GLM-4 tools and parallel function calling by @mMrBun in #4173
Support PiSSA fine-tuning by @hiyouga in #4307

New models

Base models
- DeepSeek-Coder-V2 (16B MoE/236B MoE) 📄
Instruct/Chat models
- MiniCPM-2B 📄🤖
- DeepSeek-Coder-V2-Instruct (16B MoE/236B MoE) 📄🤖

New datasets

Supervised fine-tuning datasets
- Neo-sft (zh)
- Magpie-Pro-300K-Filtered (en) by @EliMCosta in #4309
- WebInstruct (en) by @EliMCosta in #4309

Bug fix

Fix DPO+ZeRO3 problem by @hiyouga
Add MANIFEST.in by @iamthebot in #4191
Fix eos_token in llama3 pretrain by @dignfei in #4204
Fix vllm version by @kimdwkimdw and @hzhaoy in #4234 and #4246
Fix Dockerfile by @EliMCosta in #4314
Fix pandas version by @zzxzz12345 in #4334
Fix #3162 #3196 #3778 #4198 #4209 #4221 #4227 #4238 #4242 #4271 #4292 #4295 #4326 #4346 #4357 #4362

- Python
Published by hiyouga almost 2 years ago

lazyllm-llamafactory - v0.8.1: Patch release

Fix #2666: Unsloth+DoRA
Fix #4145: The PyTorch version of the docker image does not match the vLLM requirement
Fix #4160: The problem in LongLoRA implementation with the help of @f-q23
Fix #4167: The installation problem in the Windows system by @yzoaim

- Python
Published by hiyouga almost 2 years ago

lazyllm-llamafactory - v0.8.0: GLM-4, Qwen2, PaliGemma, KTO, SimPO

Stronger LlamaBoard 💪😀

Support single-node distributed training in Web UI
Add dropdown menu for easily resuming from checkpoints and picking saved configurations by @hiyouga and @hzhaoy in #4053
Support selecting checkpoints of full/freeze tuning
Add throughput metrics to LlamaBoard by @injet-zhou in #4066
Faster UI loading

New features

Add KTO algorithm by @enji-zhou in #3785
Add SimPO algorithm by @hiyouga
Support passing max_lora_rank to the vLLM backend by @jue-jue-zi in #3794
Support preference datasets in sharegpt format and remove big files from git repo by @hiyouga in #3799
Support setting system messages in CLI inference by @ycjcl868 in #3812
Add num_samples option in dataset_info.json by @seanzhang-zhichen in #3829
Add NPU docker image by @dongdongqiang2018 in #3876
Improve NPU document by @MengqingCao in #3930
Support SFT packing with greedy knapsack algorithm by @AlongWY in #4009
Add llamafactory-cli env for bug report
Support image input in the API mode
Support random initialization via the train_from_scratch argument
Initialize CI
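The greedy knapsack packing mentioned in the feature list above can be pictured as follows: each pack is filled by repeatedly taking the longest remaining sequence that still fits. This is a generic sketch of that greedy strategy, not the code merged in #4009. The function name greedy_knapsack and the capacity parameter (standing in for a maximum packed length) are assumptions made for illustration.

```python
def greedy_knapsack(lengths, capacity):
    """Group sequence lengths into packs whose totals do not exceed capacity."""
    if any(length > capacity for length in lengths):
        raise ValueError("a sequence is longer than the pack capacity")

    remaining = sorted(lengths)  # ascending, so the largest items sit at the end
    packs = []
    while remaining:
        pack, room = [], capacity
        # Scan from the largest to the smallest item and take whatever still fits.
        # Popping at idx is safe because only already-visited indices shift.
        for idx in range(len(remaining) - 1, -1, -1):
            if remaining[idx] <= room:
                room -= remaining[idx]
                pack.append(remaining.pop(idx))
        packs.append(pack)
    return packs


if __name__ == "__main__":
    print(greedy_knapsack([5, 3, 2, 7, 1], capacity=8))
```

The greedy choice is not guaranteed to produce the minimum number of packs, but it is simple and tends to leave little padding.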

New models

Base models
- Qwen2 (0.5B/1.5B/7B/72B/MoE) 📄
- PaliGemma-3B (pt/mix) 📄🖼️
- GLM-4-9B 📄
- Falcon-11B 📄
- DeepSeek-V2-Lite (16B) 📄
Instruct/Chat models
- Qwen2-Instruct (0.5B/1.5B/7B/72B/MoE) 📄🤖
- Mistral-7B-Instruct-v0.3 📄🤖
- Phi-3-small-8k-instruct (7B) 📄🤖
- Aya-23 (8B/35B) 📄🤖
- OpenChat-3.6-8B 📄🤖
- GLM-4-9B-Chat 📄🤖
- TeleChat-12B-Chat by @hzhaoy in #3958 📄🤖
- Phi-3-medium-8k-instruct (14B) 📄🤖
- DeepSeek-V2-Lite-Chat (16B) 📄🤖
- Codestral-22B-v0.1 📄🤖

New datasets

Pre-training datasets
- FineWeb (en)
- FineWeb-Edu (en)
Supervised fine-tuning datasets
- Ruozhiba-GPT4 (zh)
- STEM-Instruction (zh)
Preference datasets
- Argilla-KTO-mix-15K (en)
- UltraFeedback (en)

Bug fix

Fix RLHF for multimodal finetuning
Fix LoRA target in multimodal finetuning by @BUAADreamer in #3835
Fix yi template by @Yimi81 in #3925
Fix abort issue in LlamaBoard by @injet-zhou in #3987
Pass scheduler_specific_kwargs to get_scheduler by @Uminosachi in #4006
Fix hyperparameters helps by @xu-song in #4007
Update issue template by @statelesshz in #4011
Fix vllm dtype parameter
Fix exporting hyperparameters by @MengqingCao in #4080
Fix DeepSpeed ZeRO3 in PPO trainer
Fix #3108 #3387 #3646 #3717 #3764 #3769 #3803 #3807 #3818 #3837 #3847 #3853 #3873 #3900 #3931 #3965 #3971 #3978 #3992 #4005 #4012 #4013 #4022 #4033 #4043 #4061 #4075 #4077 #4079 #4085 #4090 #4120 #4132 #4137 #4139

- Python
Published by hiyouga almost 2 years ago

lazyllm-llamafactory - v0.7.1: Ascend NPU Support, Yi-VL Models

🚨🚨 Core refactor 🚨🚨

Add CLI usage: we now recommend using llamafactory-cli to launch training and inference. The entry point is located in cli.py
Rename files: train_bash.py -> train.py, train_web.py -> webui.py, api_demo.py -> api.py
Remove files: cli_demo.py, evaluate.py, export_model.py, web_demo.py, use llamafactory-cli chat/eval/export/webchat instead
Use YAML configs in examples instead of shell scripts for a pretty view
Remove the sha1 hash check when loading datasets
Rename arguments: num_layer_trainable -> freeze_trainable_layers, name_module_trainable -> freeze_trainable_modules

The above changes are made by @hiyouga in #3596
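Existing configurations that still use the old argument names need updating after this refactor. A minimal sketch of such a migration is shown below, assuming the configuration has already been loaded into a dictionary. The helper migrate_arguments and the RENAMED_ARGUMENTS table are hypothetical and not part of LLaMA Factory; only the two renames themselves come from the list above, and the sample values are made up.

```python
# Old name -> new name, taken from the rename list in these release notes.
RENAMED_ARGUMENTS = {
    "num_layer_trainable": "freeze_trainable_layers",
    "name_module_trainable": "freeze_trainable_modules",
}


def migrate_arguments(config: dict) -> dict:
    """Return a copy of config with deprecated argument names replaced."""
    migrated = {}
    for key, value in config.items():
        new_key = RENAMED_ARGUMENTS.get(key, key)
        if new_key in migrated:
            # Both the old and the new spelling were given; refuse to guess.
            raise ValueError(f"argument {new_key!r} is specified twice")
        migrated[new_key] = value
    return migrated


if __name__ == "__main__":
    old_config = {  # illustrative values only
        "finetuning_type": "freeze",
        "num_layer_trainable": 2,
        "name_module_trainable": "all",
    }
    print(migrate_arguments(old_config))
```

Raising on a duplicate rather than silently picking one value makes a half-migrated configuration visible early.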

REMINDER: Now installation is mandatory to use LLaMA Factory

New features

Support training and inference on the Ascend NPU 910 devices by @zhou-wjjw and @statelesshz (docker images are also provided)
Support stop parameter in vLLM engine by @zhaonx in #3527
Support fine-tuning token embeddings in freeze tuning via the freeze_extra_modules argument
Add Llama3 quickstart to readme

New models

Base models
- Yi-1.5 (6B/9B/34B) 📄
- DeepSeek-V2 (236B) 📄
Instruct/Chat models
- Yi-1.5-Chat (6B/9B/34B) 📄🤖
- Yi-VL-Chat (6B/34B) by @BUAADreamer in #3748 📄🖼️🤖
- Llama3-Chinese-Chat (8B/70B) 📄🤖
- DeepSeek-V2-Chat (236B) 📄🤖

Bug fix

Add badam arguments to LlamaBoard by @codemayq in #3487
Add openai data format to readme by @khazic in #3490
Fix slow operation in dpo/orpo trainer by @hiyouga
Fix badam examples by @pha123661 in #3578
Fix download link of the nectar_rm dataset by @ZeyuTeng96 in #3588
Add project by @Katehuuh in #3601
Fix dockerfile by @gaussian8 in #3604
Fix full tuning of MLLMs by @BUAADreamer in #3651
Fix gradio environment variables by @cocktailpeanut in #3654
Fix typo and add log in API by @Tendo33 in #3655
Fix download link of the phi-3 model by @YUUUCC in #3683
Fix #3559 #3560 #3602 #3603 #3606 #3625 #3650 #3658 #3674 #3694 #3702 #3724 #3728

- Python
Published by hiyouga about 2 years ago

lazyllm-llamafactory - v0.7.0: LLaVA Multimodal LLM Support

Congratulations on 20k stars 🎉 We ranked 1st on GitHub Trending on Apr. 23rd 🔥 Follow us on X

New features

Support SFT/PPO/DPO/ORPO for the LLaVA-1.5 model by @BUAADreamer in #3450
Support inferring the LLaVA-1.5 model with both native Transformers and vLLM by @hiyouga in #3454
Support vLLM+LoRA inference for partial models (see support list)
Support 2x faster generation of the QLoRA model based on UnslothAI's optimization
Support adding new special tokens to the tokenizer via the new_special_tokens argument
Support choosing the device to merge LoRA in LlamaBoard via the export_device argument
Add a Colab notebook for getting into fine-tuning the Llama-3 model on a free T4 GPU
Automatically enable SDPA attention and fast tokenizer for higher performance

New models

Base models
- OLMo-1.7-7B
- Jamba-v0.1-51B
- Qwen1.5-110B
- DBRX-132B-Base
Instruct/Chat models
- Phi-3-mini-3.8B-instruct (4k/128k)
- LLaVA-1.5-7B
- LLaVA-1.5-13B
- Qwen1.5-110B-Chat
- DBRX-132B-Instruct

New datasets

Supervised fine-tuning datasets
- LLaVA mixed (en&zh) by @BUAADreamer in #3471
Preference datasets
- DPO mixed (en&zh) by @hiyouga

Bug fix

Fix #2093 #3333 #3347 #3374 #3387

- Python
Published by hiyouga about 2 years ago

lazyllm-llamafactory - v0.6.3: Llama-3 and 3x Longer QLoRA

New features

Support Meta Llama-3 (8B/70B) models
Support UnslothAI's long-context QLoRA optimization (56,000 context length for Llama-2 7B in 24GB)
Support previewing local datasets in directories in LlamaBoard by @codemayq in #3291

New algorithms

Support BAdam algorithm by @Ledzy in #3287
Support Mixture-of-Depths training by @mlinmg in #3338

New models

Base models
- CodeGemma (2B/7B)
- CodeQwen1.5-7B
- Llama-3 (8B/70B)
- Mixtral-8x22B-v0.1
Instruct/Chat models
- CodeGemma-7B-it
- CodeQwen1.5-7B-Chat
- Llama-3-Instruct (8B/70B)
- Command R (35B) by @marko1616 in #3254
- Command R+ (104B) by @marko1616 in #3254
- Mixtral-8x22B-Instruct-v0.1

Bug fix

Fix full-tuning batch prediction examples by @khazic in #3261
Fix output_router_logits of Mixtral by @liu-zichen in #3276
Fix automodel from pretrained with attn implementation (see https://github.com/huggingface/transformers/issues/30298)
Fix the failure-to-converge issue in the layerwise galore optimizer (see https://github.com/huggingface/transformers/issues/30371)
Fix #3184 #3238 #3247 #3273 #3316 #3317 #3324 #3348 #3352 #3365 #3366

- Python
Published by hiyouga about 2 years ago

lazyllm-llamafactory - v0.6.2: ORPO and Qwen1.5-32B

New features

Support ORPO algorithm by @hiyouga in #3066
Support inferring BNB 4-bit models on multiple GPUs via the quantization_device_map argument
Reorganize README files, move example scripts to the examples folder
Support saving & loading arguments quickly in LlamaBoard by @hiyouga and @marko1616 in #3046
Support loading alpaca-format datasets from the hub without dataset_info.json by specifying --dataset_dir ONLINE
Add a parameter moe_aux_loss_coef to control the coefficient of auxiliary loss in MoE models.

New models

Base models
- Breeze-7B-Base
- Qwen1.5-MoE-A2.7B (14B)
- Qwen1.5-32B
Instruct/Chat models
- Breeze-7B-Instruct
- Qwen1.5-MoE-A2.7B-Chat (14B)
- Qwen1.5-32B-Chat

Bug fix

Fix pile dataset download config by @lealaxy in #3053
Fix model generation config by @marko1616 in #3057
Fix qwen1.5 models DPO training by @changingivan and @hiyouga in #3083
Support Qwen1.5-32B by @sliderSun in #3160
Support Breeze-7B by @codemayq in #3161
Fix additional_target in unsloth by @kno10 in #3201
Fix #2807 #3022 #3023 #3046 #3077 #3085 #3116 #3200 #3225

- Python
Published by hiyouga about 2 years ago

lazyllm-llamafactory - v0.6.1: Patch release

This patch mainly fixes #2983

In commit 9bec3c98a22c91b1c28fda757db51eb780291641, we built the optimizer and scheduler inside the trainers, which inadvertently introduced a bug: when DeepSpeed was enabled, the trainers in transformers would build an optimizer and scheduler before calling the create_optimizer_and_scheduler method [1]. The optimizer created by our method would then overwrite the original one, while the scheduler would not be replaced. Consequently, the scheduler no longer affected the learning rate in the active optimizer, leading to a regression in the training result. We have fixed this bug in 3bcd41b639899e72bcabc51d59bac8967af19899 and 8c77b1091296e204dc3c8c1f157c288ca5b236bd. Thanks to @HideLord for helping us identify this critical bug.

[1] https://github.com/huggingface/transformers/blob/v4.39.1/src/transformers/trainer.py#L1877-L1881
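The failure mode described above can be reproduced in miniature. The sketch below uses toy stand-ins rather than the real PyTorch, transformers, or DeepSpeed classes: ToyOptimizer, ToyLinearDecay, and run are hypothetical names. It shows only the core effect, a scheduler that keeps adjusting an optimizer that is no longer the one used for training.

```python
class ToyOptimizer:
    """Stand-in for an optimizer; it only stores a learning rate."""

    def __init__(self, lr: float):
        self.lr = lr


class ToyLinearDecay:
    """Stand-in scheduler bound to one optimizer; each step lowers that optimizer's lr."""

    def __init__(self, optimizer: ToyOptimizer, base_lr: float, total_steps: int):
        self.optimizer = optimizer
        self.base_lr = base_lr
        self.total_steps = total_steps
        self.step_count = 0

    def step(self) -> None:
        self.step_count += 1
        remaining = max(0, self.total_steps - self.step_count)
        self.optimizer.lr = self.base_lr * remaining / self.total_steps


def run(overwrite_optimizer: bool, steps: int = 5) -> float:
    """Return the learning rate seen by the optimizer that actually trains."""
    original = ToyOptimizer(lr=1.0)
    scheduler = ToyLinearDecay(original, base_lr=1.0, total_steps=10)
    # Buggy path: a second optimizer replaces the first, but the scheduler
    # still holds a reference to the original one.
    active = ToyOptimizer(lr=1.0) if overwrite_optimizer else original
    for _ in range(steps):
        scheduler.step()
    return active.lr


if __name__ == "__main__":
    print("scheduler and optimizer in sync:", run(overwrite_optimizer=False))
    print("optimizer overwritten:", run(overwrite_optimizer=True))
```

In the overwritten case the learning rate of the active optimizer never changes, which matches the symptom reported in the patch notes: the schedule runs, but training proceeds at a constant rate.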

We have also fixed #2961 #2981 #2982 #2983 #2991 #3010

- Python
Published by hiyouga about 2 years ago

lazyllm-llamafactory - v0.6.0: Paper Release, GaLore and FSDP+QLoRA

We released our paper on arXiv! Thanks to all co-authors and AK's recommendation

New features

Support GaLore algorithm, allowing full-parameter learning of a 7B model using less than 24GB VRAM
Support FSDP+QLoRA that allows QLoRA fine-tuning of a 70B model on 2x24GB GPUs
Support LoRA+ algorithm for better LoRA fine-tuning by @qibaoyuan in #2830
LLaMA Factory 🤝 vLLM, enjoy 270% inference speed with --infer_backend vllm
Add Colab notebook for easily getting started
Support pushing fine-tuned models to Hugging Face Hub in web UI
Support apply_chat_template by adding a chat template to the tokenizer after fine-tuning
Add dockerize support by @S3Studio in #2743 #2849

New models

Base models
- OLMo (1B/7B)
- StarCoder2 (3B/7B/15B)
- Yi-9B
Instruct/Chat models
- OLMo-7B-Instruct

New datasets

Supervised fine-tuning datasets
- Cosmopedia (en)
Preference datasets
- Orca DPO (en)

Bug fix

Fix flash_attn in web UI by @cx2333-gt in #2730
Fix deepspeed runtime error in PPO by @stephen-nju in #2746
Fix readme ddp instruction by @khazic in #2903
Fix environment variable in datasets by @SirlyDreamer in #2905
Fix readme information by @0xez in #2919
Fix generation config validation by @marko1616 in #2945
Fix requirements by @rkinas in #2963
Fix bitsandbytes windows version by @Tsumugii24 in #2967
Fix #2346 #2642 #2649 #2732 #2735 #2756 #2766 #2775 #2777 #2782 #2798 #2802 #2803 #2817 #2895 #2928 #2936 #2941

- Python
Published by hiyouga about 2 years ago

lazyllm-llamafactory - v0.5.3: DoRA and AWQ/AQLM QLoRA

New features

Support DoRA (Weight-Decomposed LoRA)
Support QLoRA for the AWQ/AQLM quantized models, now 2-bit QLoRA is feasible
Provide some example scripts in https://github.com/hiyouga/LLaMA-Factory/tree/main/examples

New models

Base models
- Gemma (2B/7B)
Instruct/Chat models
- Gemma-it (2B/7B)

Bug fix

Add flash-attn package for Windows user by @codemayq in #2514
Fix ppo trainer #1163 by @stephen-nju in #2525
Support atom models by @Rayrtfr in #2531
Support role in webui by @lungothrin in #2575
Bump accelerate to 0.27.2 and fix #2552 by @Katehuuh in #2608
Fix #2512 #2516 #2532 #2533 #2629

- Python
Published by hiyouga over 2 years ago

lazyllm-llamafactory - v0.5.2: Block expansion, Qwen1.5 models

New features

Support block expansion in LLaMA Pro, see tests/llama_pro.py for usage
Add use_rslora option for the LoRA method
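
For context, rank-stabilized LoRA (rsLoRA) changes only the factor by which the low-rank update is scaled: standard LoRA uses alpha / r, while rsLoRA uses alpha / sqrt(r). The helper below is a hypothetical sketch of that difference and is not taken from the library.

```python
import math


def lora_scaling(lora_alpha, rank, use_rslora=False):
    """Scaling factor applied to the low-rank update B @ A."""
    if use_rslora:
        # rsLoRA: scale by alpha / sqrt(r) so the update does not shrink
        # as quickly when the rank grows.
        return lora_alpha / math.sqrt(rank)
    # Standard LoRA: scale by alpha / r.
    return lora_alpha / rank
```

For example, with lora_alpha=16 and rank=64 the standard factor is 0.25, whereas the rank-stabilized factor is 2.0.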

New models

Base models
- Qwen1.5 (0.5B/1.8B/4B/7B/14B/72B)
- DeepSeekMath-7B-Base
- DeepSeekCoder-7B-Base-v1.5
- Orion-14B-Base
Instruct/Chat models
- Qwen1.5-Chat (0.5B/1.8B/4B/7B/14B/72B)
- MiniCPM-2B-SFT/DPO
- DeepSeekMath-7B-Instruct
- DeepSeekCoder-7B-Instruct-v1.5
- Orion-14B-Chat
- Orion-14B-Long-Chat
- Orion-14B-RAG-Chat
- Orion-14B-Plugin-Chat

New datasets

Supervised fine-tuning datasets
- SlimOrca (en)
- Dolly (de)
- Dolphin (de)
- Airoboros (de)
Preference datasets
- Orca DPO (de)

Bug fix

Fix torch_dtype check in export model by @fenglui in #2262
Add Russian locale to LLaMA Board by @seoeaa in #2264
Remove manually set use_cache in export model by @yhyu13 in #2266
Fix DeepSpeed Zero3 training with MoE models by @A-Cepheus in #2283
Add a patch for full training of the Mixtral model using DeepSpeed Zero3 by @ftgreat in #2319
Fix bug in data pre-processing by @lxsyz in #2411
Add German sft and dpo datasets by @johannhartmann in #2423
Add version checking in test_toolcall.py by @mini-tiger in #2435
Enable parsing of SlimOrca dataset by @mnmueller in #2462
Add tags for models when pushing to hf hub by @younesbelkada in #2474
Fix #2189 #2268 #2282 #2320 #2338 #2376 #2388 #2394 #2397 #2404 #2412 #2420 #2421 #2436 #2438 #2471 #2481

- Python
Published by hiyouga over 2 years ago

lazyllm-llamafactory - v0.5.0: Agent Tuning, Unsloth Integration

Congratulations on 10k stars 🎉 Make LLM fine-tuning easier and faster together with LLaMA-Factory ✨

New features

Support agent tuning for most models: you can fine-tune any LLM with --dataset glaive_toolcall for tool use #2226
Support function calling in both API and Web mode with fine-tuned models, in the same format as OpenAI's
LLaMA Factory 🤝 Unsloth, enjoy 170% LoRA training speed with --use_unsloth, see benchmarking here
Support fine-tuning models on MPS devices #2090

New models

Base models
- Phi-2 (2.7B)
- InternLM2 (7B/20B)
- SOLAR-10.7B
- DeepseekMoE-16B-Base
- XVERSE-65B-2
Instruct/Chat models
- InternLM2-Chat (7B/20B)
- SOLAR-10.7B-Instruct
- DeepseekMoE-16B-Chat
- Yuan (2B/51B/102B)

New datasets

Supervised fine-tuning datasets
- deepctrl dataset
- Glaive function calling dataset v2

Core updates

Refactor data engine: clearer dataset alignment, easier templating and tool formatting
Refactor saving logic for models with value head #1789
Use ruff code formatter for stylish code

Bug fix

Bump transformers version to 4.36.2 by @ShaneTian in #1932
Fix requirements by @dasdristanta13 in #2117
Add Machine-Mindset project by @JessyTsu1 in #2163
Fix typo in readme file by @junuMoon in #2194
Support resize token embeddings with ZeRO3 by @liu-zichen in #2201
Fix #1073 #1462 #1617 #1735 #1742 #1789 #1821 #1875 #1895 #1900 #1908 #1907 #1909 #1923 #2014 #2067 #2081 #2090 #2098 #2125 #2127 #2147 #2161 #2164 #2183 #2195 #2249 #2260

- Python
Published by hiyouga over 2 years ago

lazyllm-llamafactory - v0.4.0: Mixtral-8x7B, DPO-ftx, AutoGPTQ integration

🚨🚨 Core refactor

Deprecate checkpoint_dir and use adapter_name_or_path instead
Replace resume_lora_training with create_new_adapter
Move the patches in model loading to llmtuner.model.patcher
Bump to Transformers 4.36.1 to adapt to the Mixtral models
Wide adaptation for FlashAttention2 (LLaMA, Falcon, Mistral)
Temporarily disable LongLoRA due to breaking changes, which will be supported later

The above changes were made by @hiyouga in #1864

New features

Add DPO-ftx: mixing fine-tuning gradients into DPO via the dpo_ftx argument, suggested by @lylcst in https://github.com/hiyouga/LLaMA-Factory/issues/1347#issuecomment-1846943606
Integrate AutoGPTQ into the model export via the export_quantization_bit and export_quantization_dataset arguments
Support loading datasets from ModelScope Hub by @tastelikefeet and @wangxingjun778 in #1802
Support resizing token embeddings with the noisy mean initialization by @hiyouga in a66186b8724ffd0351a32593ab52d8a2312f339b
Support system column in both alpaca and sharegpt dataset formats
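
To make the noisy mean initialization mentioned above concrete, here is a minimal sketch using plain Python lists. The function name is hypothetical, and the noise scale of 1 / sqrt(dim) is an assumption made for this example; refer to the referenced commit for the actual behavior.

```python
import math
import random


def noisy_mean_rows(embeddings, num_new_tokens, rng=None):
    """Append rows for new tokens, each set to the mean embedding plus noise."""
    rng = rng or random.Random()
    dim = len(embeddings[0])
    # Column-wise mean over the existing token embeddings.
    mean = [sum(row[j] for row in embeddings) / len(embeddings) for j in range(dim)]
    std = 1.0 / math.sqrt(dim)  # assumed noise scale, for illustration only
    new_rows = [
        [m + rng.gauss(0.0, std) for m in mean] for _ in range(num_new_tokens)
    ]
    # Existing rows are kept unchanged; new rows are appended at the end.
    return embeddings + new_rows
```

Starting the new rows near the mean keeps them in the same region as the trained embeddings, while the noise prevents the new tokens from being identical to each other.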

New models

Base models
- Mixtral-8x7B-v0.1
Instruct/Chat models
- Mixtral-8x7B-v0.1-instruct
- Mistral-7B-Instruct-v0.2
- XVERSE-65B-Chat
- Yi-6B-Chat

Bug fix

Improve logging for unknown arguments by @yhyu13 in #1868
Fix an overflow issue in LLaMA2 PPO training #1742
Fix #246 #1561 #1715 #1764 #1765 #1770 #1771 #1784 #1786 #1795 #1815 #1819 #1831

- Python
Published by hiyouga over 2 years ago

lazyllm-llamafactory - v0.3.3: ModelScope integration, reward server

New features

Support loading pre-trained models from ModelScope Hub by @tastelikefeet in #1700
Support launching a reward model server in demo API by specifying --stage=rm in api_demo.py
Support using a reward model server in PPO training by specifying --reward_model_type api
Support adjusting the shard size of exported models via the export_size argument
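
As a sketch of what a shard size limit means in practice, the function below greedily groups tensors into shards that stay under a byte budget. The function name and the dictionary-of-sizes input are hypothetical; the real export relies on the saving logic of the underlying libraries rather than this code.

```python
def shard_by_size(tensor_sizes, max_shard_bytes):
    """Greedily split {name: size_in_bytes} into shards under a size limit.

    A tensor larger than the limit still gets a shard of its own.
    """
    shards, current, current_size = [], {}, 0
    for name, size in tensor_sizes.items():
        # Start a new shard when adding this tensor would exceed the limit.
        if current and current_size + size > max_shard_bytes:
            shards.append(current)
            current, current_size = {}, 0
        current[name] = size
        current_size += size
    if current:
        shards.append(current)
    return shards
```

A smaller limit produces more, smaller files, which can help when individual files must fit upload or memory constraints.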

New models

Base models
- DeepseekLLM-Base (7B/67B)
- Qwen (1.8B/72B)
Instruct/Chat models
- DeepseekLLM-Chat (7B/67B)
- Qwen-Chat (1.8B/72B)
- Yi-34B-Chat

New datasets

Supervised fine-tuning datasets
- Nectar dataset by @mlinmg in #1689
Preference datasets
- Nectar dataset by @mlinmg in #1689

Bug fix

Improve get_current_device by @billvsme in #1690
Improve web UI preview by @Samge0 in #1695
Fix #1543 #1597 #1657 #1658 #1659 #1668 #1682 #1696 #1699 #1703 #1707 #1710

- Python
Published by hiyouga over 2 years ago

lazyllm-llamafactory - v0.3.2: Patch release

New features

Support training GPTQ quantized models #729 #1481 #1545
Support resuming reward model training #1567

Bug fix

Change default PPO parameters by @hannlp in #1553
Fix ChatGLM2&3 templates #1453 #1480
Fix #1548 by @Outsider565 in #1544
Fix #1263 #1550 #1558

- Python
Published by hiyouga over 2 years ago

lazyllm-llamafactory - v0.3.0: Full-parameter RLHF

New features

Support full-parameter RLHF training (RM & PPO)
Refactor llmtuner core in #1525 by @hiyouga
Better LLaMA Board: full-parameter RLHF and demo mode

New models

Base models
- ChineseLLaMA-1.3B
- LingoWhale-8B
Instruct/Chat models
- ChineseAlpaca-1.3B
- Zephyr-7B-Alpha/Beta

Bug fix

Fix bugs in partial-parameter (freeze) tuning
Fix #224 #336 #931 #936 #1011 #1489 #1494 #1507 #1514

- Python
Published by hiyouga over 2 years ago

lazyllm-llamafactory - v0.2.2: Patch release

Bug fix

Fix the OOM issue in PPO training by @mmbwf in #424
Fix fine-tuning arguments by @yyq in #1454
Refactor constants and evaluation by @hiyouga
Fix #1452 #1466 #1478

- Python
Published by hiyouga over 2 years ago

lazyllm-llamafactory - v0.2.1: Variant models, NEFTune trick

New features

Support NEFTune trick for supervised fine-tuning by @anvie in #1252
Support loading datasets in the sharegpt format (read data/readme for details)
Support generating multiple responses in demo API via the n parameter
Support caching the pre-processed dataset files via the cache_path argument
Better LLaMA Board (pagination, controls, etc.)
Support push_to_hub argument #1088
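
For readers unfamiliar with NEFTune, the idea is to add uniform noise to the input embeddings during supervised fine-tuning, scaled by alpha / sqrt(L * d) for sequence length L and embedding dimension d. The sketch below uses plain Python lists; the function and parameter names are hypothetical and not the library's API.

```python
import math
import random


def neftune_noise(embeddings, noise_alpha, rng=None):
    """Return a noisy copy of a [seq_len][dim] embedding matrix."""
    rng = rng or random.Random()
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    # Noise magnitude shrinks as the sequence or the embedding gets larger.
    scale = noise_alpha / math.sqrt(seq_len * dim)
    return [
        [x + scale * rng.uniform(-1.0, 1.0) for x in row] for row in embeddings
    ]
```

The noise is applied only during training; at inference time the embeddings are used unchanged.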

New models

Base models
- ChatGLM3-6B-Base
- Yi (6B/34B)
- Mistral-7B
- BlueLM-7B-Base
- Skywork-13B-Base
- XVERSE-65B
- Falcon-180B
- Deepseek-Coder-Base (1.3B/6.7B/33B)
Instruct/Chat models
- ChatGLM3-6B
- Mistral-7B-Instruct
- BlueLM-7B-Chat
- Zephyr-7B
- OpenChat-3.5
- Yayi (7B/13B)
- Deepseek-Coder-Instruct (1.3B/6.7B/33B)

New datasets

Pre-training datasets
- RedPajama V2
- Pile
Supervised fine-tuning datasets
- OpenPlatypus
- ShareGPT Hyperfiltered
- ShareGPT4
- UltraChat 200k
- AgentInstruct
- LMSYS Chat 1M
- Evol Instruct V2

Bug fix

Fix full-parameter DPO training #1383 #1422 (inspired by @mengban)
Fix tokenizer config by @lvzii in #1436
Fix #1197 #1215 #1217 #1218 #1228 #1232 #1285 #1287 #1290 #1316 #1325 #1349 #1356 #1365 #1411 #1418 #1438 #1439 #1446

- Python
Published by hiyouga over 2 years ago

lazyllm-llamafactory - v0.2.0: Refactor Web UI, support LongLoRA

New features

Support LongLoRA for the LLaMA models
Support training the Qwen-14B and InternLM-20B models
Support training state recovery for the all-in-one Web UI
Support Ascend NPU by @statelesshz in #975
Integrate MMLU, C-Eval and CMMLU benchmarks

Modifications

Rename repository to LLaMA Factory (formerly LLaMA Efficient Tuning)
Use the cutoff_len argument instead of max_source_length and max_target_length #944
Add a train_on_prompt option #1184

Bug fix

Fix numeric error caused by the layer norm dtype in https://github.com/hiyouga/LLaMA-Factory/commit/84b7486885c600e5e65c5ba9095d56ecc2502977 [1]
Fix bugs in PPO Trainer by @mmbwf in #900
Fix #424 #762 #814 #887 #913 #1000 #1026 #1032 #1064 #1068 #1074 #1086 #1097 #1176 #1177 #1190 #1191

[1] https://github.com/huggingface/transformers/pull/25598#discussion_r1335345914

- Python
Published by hiyouga over 2 years ago

lazyllm-llamafactory - v0.1.8: FlashAttention-2 and Baichuan2

New features

Support FlashAttention-2 for LLaMA models (requires an RTX 4090, A100, A800 or H100)
Support training the Baichuan2 models
Use right-padding to avoid overflow in fp16 training (also mentioned here)
Align the computation method of the reward score with DeepSpeed-Chat (better generation)
Support --lora_target all argument which automatically finds the applicable modules for LoRA training

Bug fix

Use efficient EOS tokens to align with the Baichuan training (https://github.com/baichuan-inc/Baichuan2/issues/23)
Remove PeftTrainer to save model checkpoints in DeepSpeed training
Fix bugs in web UI by @beat4ocean in #596, by @codemayq in #644 #651 #678 #741, and by @kinghuin in #786
Add dataset explanation by @panpan0000 in #629
Fix a bug in the DPO data collator
Fix a bug of the ChatGLM2 tokenizer in right-padding
Fix #608 #617 #649 #757 #761 #763 #809 #818

- Python
Published by hiyouga over 2 years ago

lazyllm-llamafactory - v0.1.7: Script preview and RoPE scaling

New features

Preview training script in Web UI by @codemayq in #479 #511
Support resuming from checkpoints by @niuba in #434 (transformers>=4.31.0 required)
Two RoPE scaling methods: linear and NTK-aware scaling for LLaMA models (transformers>=4.31.0 required)
Support training the ChatGLM2-6B model
Support PPO training in bfloat16 data type #551
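
The two RoPE scaling ideas mentioned above differ in what they rescale: linear scaling compresses the position indices, while NTK-aware scaling enlarges the rotary base. The sketch below uses the fixed-factor NTK-aware formula, base * factor^(d / (d - 2)), purely as an illustration; the function names are hypothetical, and the implementation in transformers may adjust the base dynamically with sequence length, which is not reproduced here.

```python
def rope_inv_freq(dim, base=10000.0):
    """Inverse frequencies for rotary embeddings of head dimension `dim`."""
    return [1.0 / (base ** (i / dim)) for i in range(0, dim, 2)]


def linear_scaled_angles(position, dim, factor, base=10000.0):
    # Linear scaling: shrink positions by `factor`, keep the frequencies.
    return [(position / factor) * f for f in rope_inv_freq(dim, base)]


def ntk_scaled_angles(position, dim, factor, base=10000.0):
    # NTK-aware scaling: keep positions, enlarge the base so that high
    # frequencies barely change and low frequencies are stretched.
    new_base = base * factor ** (dim / (dim - 2))
    return [position * f for f in rope_inv_freq(dim, new_base)]
```

With linear scaling and factor 2, position 8 produces the same rotation angles as position 4 did originally. With NTK-aware scaling the highest frequency is left untouched and only the lowest frequency is reduced by the full factor.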

Bug fix

Unusual output of quantized models #278 #391
Runtime error in distributed DPO training #480
Unexpected truncation in generation #532
Dataset streaming error in pre-training #548 #549
Tensor shape mismatch in PPO training using ChatGLM2 #527 #528
#475 #476 #478 #481 #494 #551

- Python
Published by hiyouga almost 3 years ago

lazyllm-llamafactory - v0.1.6: DPO Training and Qwen-7B

Adapt DPO training from the TRL library
Support fine-tuning the Qwen-7B, Qwen-7B-Chat, XVERSE-13B, and ChatGLM2-6B models
Implement the "safe" ChatML template for Qwen-7B-Chat
Better Web UI
Pretty readme by @codemayq in #382
New features: #395 #451
Fix InternLM-7B inference #312
Fix bugs: #351 #354 #361 #376 #408 #417 #420 #423 #426
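
As background for the DPO support listed above, the standard DPO objective for a single preference pair is -log(sigmoid(beta * ((log pi(chosen) - log pi(rejected)) - (log ref(chosen) - log ref(rejected))))). The function below is a minimal scalar sketch of that formula with hypothetical names; it is not the TRL or LLaMA Factory implementation.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair, given sequence log-probabilities."""
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    logits = policy_margin - ref_margin
    # -log(sigmoid(z)) equals softplus(-z); computed in a stable form.
    x = -beta * logits
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))
```

When the policy and the reference model agree on the margin, the loss equals log 2; it decreases as the policy prefers the chosen response more strongly than the reference does.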

- Python
Published by hiyouga almost 3 years ago

lazyllm-llamafactory - v0.1.5: Patch release

Fix LLaMA-2 template #307
Fix bug in preprocessing 968ce0dcce6bfef582ce37aea6566a65f5aac811
Fix #294 #296

- Python
Published by hiyouga almost 3 years ago

lazyllm-llamafactory - v0.1.4: Dataset streaming

Support dataset streaming
Fix LLaMA-2 #268
Fix DeepSpeed ZeRO-3 model save #274
Fix #242 #284

- Python
Published by hiyouga almost 3 years ago

lazyllm-llamafactory - v0.1.3: Patch release

- Python
Published by hiyouga almost 3 years ago

lazyllm-llamafactory - v0.1.2: LLaMA-2 models

Support LLaMA-2 (good issue #202)
Advanced configurations in Web UI
Fix API (downgrade pydantic<2.0.0)
Fix baichuan lora hparam #194 #212
Fix padding #196
Fix ZeRO-3 #199
Allow passing args to the app #213
Code simplification
Add ShareGPT dataset

- Python
Published by hiyouga almost 3 years ago

lazyllm-llamafactory - v0.1.1

Web UI: source_prefix, max_length, dev set
Bug fix: reward token #179
Update template #171 #177
Bug fix: replace the Literal type with Enum for pydantic [1] #176
Add Web demo #180

[1] https://github.com/pydantic/pydantic/issues/5821, https://github.com/tiangolo/sqlmodel/issues/67

- Python
Published by hiyouga almost 3 years ago

lazyllm-llamafactory - v0.1.0: All-in-one Web UI

Fix gradient accumulation in PPO Trainer https://github.com/hiyouga/ChatGLM-Efficient-Tuning/issues/299
All-in-one Web UI by @hiyouga, @KanadeSiina and @codemayq

- Python
Published by hiyouga almost 3 years ago

lazyllm-llamafactory - v0.0.9

- Python
Published by hiyouga almost 3 years ago
