Recent Releases of lazyllm-llamafactory
lazyllm-llamafactory - v0.9.3: Llama4, Gemma3, Qwen3, InternVL3, Qwen2.5-Omni
We will attend the AWS Summit Shanghai 2025 on June 20th! See you in Shanghai!
- Event info: https://aws.amazon.com/cn/events/summits/shanghai/
New features
- 🔥 InternVL2.5/InternVL3 model by @Kuangdd01 in #7258
- 🔥 Qwen2.5-Omni model by @Kuangdd01 in #7537
- 🔥 Llama 4 and Gemma 3 multimodal model by @hiyouga in #7273 and #7611
- 🔥 Official GPU docker image by @yzoaim in #8181
- 🔥 SGLang inference by @Qiaolin-Yu and @jhinpan in #7278
- GLM-4-0414 and GLM-Z1 model by @zRzRzRzRzRzRzR in #7695
- Kimi-VL model by @Kuangdd01 in #7719
- Qwen3 model by @hiyouga in #7885
- MiMo and MiMo-VL model by @Kuangdd01 in #7946 #8249
- SmolLM/SmolLM2 model by @akshatsehgal in #8050 #8220
- MiniCPM4 model by @LDLINGLINGLING in #8314
- Mistral-Small-3.1 model by @Kuangdd01 in #8335
- Add `scripts/eval_bleu_rouge.py` by @SnowFox4004 in #7419
- Add Muon optimizer by @tianshijing in #7749
- Support video/audio inference with vLLM by @hiyouga in #7566
- Support S3/GCS cloud data by @erictang000 in #7567
- Support vLLM-ascend by @leo-pony in #7739
- Support OmegaConf by @hiyouga in #7793
- Support early-stopping by @hiyouga in #7797
- Add `enable_thinking` argument for reasoning models by @hiyouga in #7928
- PyTorch-elastic and fault-tolerant launch by @hubutui in #8286
- Length Desensitization DPO (LD-DPO) by @amangup in #8362
New models
- Base models
- SmolLM/SmolLM2 (135M/360M/1.7B)
- Qwen3 Base (0.6B/1.7B/4B/8B/14B/30B)
- Gemma 3 (1B/4B/12B/27B) 🖼️
- MedGemma (4B) 🩺
- MiMo Base (7B)
- Seed-Coder Base (8B) ⌨️
- Mistral-Small-3.1 Base (24B) 🖼️
- GLM-4-0414 Base (32B)
- Llama 4 (109B/492B) 🖼️
- Instruct/Chat models
- SmolLM/SmolLM2 Instruct (135M/360M/1.7B) 🤖
- MiniCPM4 (0.5B/8B) 🤖
- Qwen3 (0.6B/1.7B/4B/8B/14B/32B/30B/235B) 🤖🧠
- Gemma 3 Instruct (1B/4B/12B/27B) 🤖🖼️
- InternVL2.5/3 Instruct/MPO (1B/2B/8B/14B/38B/78B) 🤖🖼️
- Qwen2.5-Omni (3B/7B) 🤖🖼️
- MedGemma Instruct (4B/27B) 🤖🩺
- MiMo SFT/RL (7B) 🤖
- MiMo-VL SFT/RL (7B) 🤖🖼️
- Hunyuan Instruct (7B) 🤖
- Seed-Coder Instruct/Reasoning (8B) 🤖🧠⌨️
- GLM-4-0414/GLM-Z1 Instruct (9B/32B) 🤖🧠
- DeepSeek-R1-0528 (8B/671B) 🤖🧠
- Kimi-VL Instruct/Thinking (17B) 🤖🧠🖼️
- Mistral-Small-3.1 Instruct (24B) 🤖🖼️
- Qwen2.5-VL Instruct (32B) 🤖🖼️
- Llama 4 Instruct (109B/492B) 🤖🖼️
New datasets
- Preference datasets
- COIG-P (zh)
Bug fix
- Fix add new tokens by @flashJd in #7253
- Fix ultrachat_200k dataset by @felladrin in #7259
- Add efficient 4D attention mask for neat packing by @BlackWingedKing in #7272
- Fix WSD lr scheduler by @x22x22 in #7304
- Fix position ids in neat packing by @BlackWingedKing in #7318
- Fix proxy setting in webui by @taoharry in #7332
- Improve entrypoint by @ENg-122 in #7345
- Fix ray destroy process group by @erictang000 in #7395
- Fix SGLang dependencies by @guoquan in #7432
- Upgrade docker package version by @rumichi2210 in #7442
- Update liger kernel for qwen2.5-vl by @xiaosu-zhu in #7453
- Fix lora on quant models by @GuoCoder in #7456
- Enable liger kernel for gemma3 by @kennylam777 in #7462
- Enable liger kernel for paligemma by @eljandoubi in #7466
- Add Swanlab lark notification by @Xu-pixel in #7481
- Fix gemma3 use cache attribute by @ysjprojects in #7500
- Fix pixtral plugin by @Kuangdd01 in #7505
- Fix KTO mismatch pair strategy by @himalalps in #7509
- Support `dataset_shards` by @aliencaocao in #7530
- Fix qwen2.5omni plugin by @Kuangdd01 in #7573 #7578 #7883
- Fix ppo trainer by @gechengze in #7576
- Fix workflow by @Shawn-Tao in #7635
- Support qwen2.5omni audio+video2text by @Kuangdd01 in #7638
- Upgrade deps for SGLang by @adarshxs in #7639
- Allow ray env setting by @erictang000 in #7647
- Fix CUDA warning on intel xpus by @jilongW in #7655
- Fix liger kernel patch by @danny980521 in #7660
- Fix rocm dockerfile by @fluidnumerics-joe in #7725
- Fix qwen2vl with neat packing by @GeoffreyChen777 in #7754
- Fix a constant by @AlphaBladez in #7765
- Fix autogptq for Gemma by @ddddng in #7786
- Fix internvl models by @Kuangdd01 in #7801 #7803 #7817 #8129
- Fix DeepSpeed ZeRO3 on moe models by @hiyouga in #7826 #7879
- Fix gradient checkpoint func for vit by @hiyouga in #7830
- Support S3 ray storage by @erictang000 in #7854
- Fix Kimi-VL attention by @Kuangdd01 in #7867
- Fix minicpm-o vllm inference by @hiyouga in #7870
- Unfreeze multimodal projector in freeze training by @zhaop-l in #7872
- Fix Qwen2.5-omni plugin by @hiyouga in #7875 #7962
- Add warp support link by @ericdachen in #7887
- Replace eos token for base model by @hiyouga in #7911
- Add `eval_on_each_dataset` arg by @hiyouga in #7912
- Fix qwen3 loss by @hiyouga in #7923 #8109
- Add repetition_penalty to api by @wangzhanxd in #7958
- Add graphgen to readme by @tpoisonooo in #7974
- Support video params in vllm batch infer by @Kuangdd01 in #7992
- Fix tool formatter by @yunhao-tech in #8000
- Fix kimi vl plugin by @hiyouga in #8015
- Support batch preprocess in vllm batch infer by @Shawn-Tao in #8051
- Support loading remote folder by @erictang000 in #8078
- Fix video utils import by @Kuangdd01 in #8077
- Fix SGLang LoRA inference by @Kiko-RWan in #8067
- Fix cli by @Wangbiao2 in #8095
- Fix pretrain workflow by @SunnyHaze in #8099
- Fix rope args for yarn by @piamo in #8101
- Add no build isolation in installing by @hiyouga in #8103
- Switch to GPTQModel and deprecate AutoGPTQ by @hiyouga in #8108
- Support llama3 parallel function call by @hiyouga in #8124
- Add `data_shared_file_system` by @hiyouga in #8179
- Fix load remote files by @youngwookim in #8183
- Fix dataset info by @Muqi1029 in #8197
- Fix qwen2.5 omni merge script by @Kuangdd01 in #8227 #8293
- Add unittest for VLM save load by @Kuangdd01 in #8248
- Add tag in swanlab by @Zeyi-Lin in #8258
- Support input video frames by @Kuangdd01 in #8264
- Fix empty template by @hiyouga in #8312
- Support full-finetuning with unsloth by @Remorax in #8325
- Add awesome work by @MING-ZCH in #8333
- Release v0.9.3 by @hiyouga in #8386
- Fix qwen2vl position ids by @hiyouga in #8387
- Fix vlm utils by @hiyouga in #8388
- Fix #3802 #4443 #5548 #6236 #6322 #6432 #6708 #6739 #6881 #6919 #7080 #7105 #7119 #7225 #7267 #7327 #7389 #7416 #7427 #7428 #7443 #7447 #7454 #7490 #7501 #7502 #7513 #7520 #7541 #7545 #7552 #7563 #7598 #7600 #7613 #7636 #7678 #7680 #7687 #7688 #7730 #7743 #7772 #7791 #7800 #7816 #7829 #7845 #7865 #7874 #7889 #7905 #7906 #7907 #7909 #7916 #7918 #7919 #7939 #7953 #7965 #7990 #8008 #8056 #8061 #8066 #8069 #8087 #8091 #8092 #8096 #8097 #8111 #8119 #8147 #8166 #8169 #8174 #8182 #8189 #8223 #8241 #8247 #8253 #8294 #8309 #8324 #8326 #8332
Full Changelog: https://github.com/hiyouga/LLaMA-Factory/compare/v0.9.2...v0.9.3
Published by hiyouga 12 months ago
lazyllm-llamafactory - v0.9.2: MiniCPM-o, SwanLab, APOLLO
This is the last version before LLaMA-Factory v1.0.0. We are working hard to improve the efficiency and availability.
We will attend the vLLM Beijing Meetup on Mar 16th! See you in Beijing!
- Event info: https://mp.weixin.qq.com/s/viPRDlhnzS3qO9-96fMeeA
New features
- 🔥 APOLLO optimizer by @zhuhanqing in #6617
- 🔥 SwanLab experiment tracker by @Zeyi-Lin in #6401
- 🔥 Ray Trainer by @erictang000 in #6542
- Batch inference with vLLM TP by @JieShenAI in #6190
- QLoRA on Ascend NPU by @codemayq in #6601
- Yarn and Llama3 rope scaling by @hiyouga in #6693
- Support `uv run` by @erictang000 in #6907
- Ollama modelfile auto-generation by @codemayq in #4686
- Mistral tool prompt by @AlongWY in #5473
- Llama3 and Qwen2 tool prompt by @hiyouga in #6367 and #6369
New models
- Base models
- GPT2 (0.1B/0.4B/0.8B/1.5B)
- Granite 3.0-3.1 (1B/2B/3B/8B)
- PaliGemma2 (3B/10B/28B) 🖼️
- Moonlight (16B)
- DeepSeek V2-V2.5 Base (236B)
- DeepSeek V3 Base (671B)
- Instruct/Chat models
- Granite 3.0-3.1 (1B/2B/3B/8B) by @Tuyohai in #5922 🤖
- DeepSeek R1 (1.5B/7B/8B/14B/32B/70B/671B) by @Qwtdgh in #6767 🤖
- TeleChat2 (3B/7B/12B/35B/115B) by @ge-xing in #6313 🤖
- Qwen2.5-VL (3B/7B/72B) by @hiyouga in #6779 🤖🖼️
- PaliGemma2-mix (3B/10B/28B) by @Kuangdd01 in #7060 🤖🖼️
- Qwen2 Audio (7B) by @BUAADreamer in #6701 🤖
- MiniCPM-V/MiniCPM-o (8B) by @BUAADreamer in #6598 and #6631 🤖🖼️
- InternLM3-Instruct (8B) by @hhaAndroid in #6640 🤖
- Marco-o1 (8B) 🤖
- Skywork-o1 (8B) 🤖
- Phi-4 (14B) 🤖
- Moonlight Instruct (16B)
- Mistral Small (24B) 🤖
- QwQ (32B) 🤖
- Llama-3.3-Instruct (70B) 🤖
- QvQ (72B) 🤖🖼️
- DeepSeek V2-V2.5 (236B) 🤖
- DeepSeek V3 (671B) 🤖
New datasets
- Supervised fine-tuning datasets
- OpenO1 (en)
- Open Thoughts (en)
- Open-R1-Math (en)
- Chinese-DeepSeek-R1-Distill (zh)
Changes
- Refactor VLMs register by @hiyouga in #6600
- Refactor mm plugin by @hiyouga in #6895
- Refactor template by @hiyouga in #6896
- Refactor data pipeline by @hiyouga in #6901
- Update vlm arguments by @hiyouga in #6976
- We removed large files from the git history using BFG Repo-Cleaner; a backup of the original repository is available
Bug fix
- Add `trust_remote_code` option by @yafshar in #5819
- Fix mllama config by @hiyouga in #6137 and #6140
- Fix mllama pad by @hiyouga in #6151 and #6874
- Pin tokenizers version by @hiyouga in #6157
- Fix tokenized data loading by @village-way in #6160
- Show hostname in webui by @hykilpikonna in #6170
- Fix VLMs zero3 training by @hiyouga in #6233
- Add `skip_special_tokens` by @hiyouga in #6363
- Support non-reentrant GC by @hiyouga in #6364
- Add `disable_shuffling` option by @hiyouga in #6388
- Fix gen kwargs by @hiyouga in #6395
- Enable module run by @youkaichao in #6457
- Fix eval loss value by @hiyouga in #6465
- Fix paligemma inference by @hiyouga in #6483
- Add deepseek v3 template by @piamo in #5507
- Add http proxy argument in dockerfile by @shibingli in #6462
- Fix trainer generate by @hiyouga in #6512
- Fix pixtral DPO training by @hiyouga in #6547
- Fix ray args by @stephen-nju in #6564
- Fix minicpm template by @BUAADreamer in #6620
- Fix stop tokens for visual detection by @hiyouga in #6624
- Pin vllm version by @hiyouga in #6629
- Fix mllama any image by @hiyouga in #6637 and #7053
- Fix tokenizer max length by @xiaosu-zhu in #6632
- Fix webui locale by @steveepreston in #6653
- Fix MiniCPM-o DPO training by @BUAADreamer in #6657
- Fix Qwen2 MoE training by @hiyouga in #6684
- Upgrade to gradio 5 by @hiyouga in #6688
- Support Japanese local file by @engchina in #6698
- Fix DPO loss by @yinpu in #6722
- Webui thinking mode by @hiyouga in #6778
- Upgrade to transformers 4.48 by @hiyouga in #6628
- Fix ci by @hiyouga in #6787
- Fix instructions about installing fa2 on win platform in readme by @neavo in #6788
- Fix minicpmv plugin by @BUAADreamer in #6801, #6890, #6946 and #6998
- Fix qwen2 tool prompt by @yueqis in #6796
- Fix llama pro by @hiyouga in #6814
- Allow thought in function call by @yueqis in #6797
- Add `ALLOW_EXTRA_ARGS` by @hiyouga in #6831
- Fix Qwen2vl plugin by @hiyouga in #6855
- Upgrade vllm to 0.7.2 by @hiyouga in #6857
- Fix unit test for tool using by @hiyouga in #6865
- Skip broken data in sharegpt converter by @JJJYmmm in #6879
- Fix qwen2.5 plugin for video by @JJJYmmm in #6868
- Parsing chat template from tokenizer by @hiyouga in #6905 (experimental)
- Fix mllama KTO training by @marko1616 in #6904
- Fix grad checkpointing by @hiyouga in #6916 and #6931
- Fix ollama template by @hiyouga in #6902
- Fix ray example by @erictang000 in #6906
- Improve error handling for media by @noahc1510 in #6128
- Support split on each dataset by @SrWYG in #5522
- Fix gen kwargs in training by @aliencaocao in #5451
- Liger kernel for qwen2.5vl by @hiyouga in #6930
- Fix lora target modules by @hiyouga in #6944
- Add `ray_storage_path` by @erictang000 in #6920
- Fix trainer.predict by @hiyouga in #6972
- Add min resolution control by @hiyouga in #6975
- Upgrade transformers to 4.49 by @hiyouga in #6982
- Add seed in vllm batch predict by @JieShenAI in #7058
- Fix pyproject.toml by @hiyouga in #7067
- Upgrade CANN images by @leo-pony in #7061
- Display swanlab link by @Zeyi-Lin in #7089
- Fix hf engine by @hiyouga in #7120
- Add bailing chat template by @oldstree in #7117
- Use bicubic resampler instead of nearest by @hiyouga in #7143
- Fix Qwen2Audio plugin by @lsrami in #7166
- Destroy process group by @hiyouga in #7174
- Fix swanlab callback by @Zeyi-Lin in #7176
- Fix paligemma plugin by @hiyouga in #7181
- Escape html tag in webui by @hiyouga in #7190
- Upgrade vllm to 0.7.3 by @hiyouga in #7183 and #7193
- Fix parser by @hiyouga in #7204
- Fix function formatter by @zhangch-ss in #7201
- Fix deepspeed config by @hiyouga in #7205
- Fix dataloader by @hiyouga in #7207
- Fix export tokenizer by @hiyouga in #7230
- Update arguments by @hiyouga in #7231
- Add `swanlab_logdir` by @Zeyi-Lin in #7219
- Fix vllm batch prediction by @hiyouga in #7235
- Avoid exit after saving tokenized data by @hiyouga in #7244
- Support commit in env by @hiyouga in #7247
- Release v0.9.2 by @hiyouga in #7242
- Fix #1204 #3306 #3462 #5121 #5270 #5404 #5444 #5472 #5518 #5616 #5712 #5714 #5756 #5944 #5986 #6020 #6056 #6092 #6136 #6139 #6149 #6165 #6213 #6287 #6320 #6345 #6345 #6346 #6348 #6358 #6362 #6391 #6415 #6439 #6448 #6452 #6482 #6499 #6543 #6546 #6551 #6552 #6610 #6612 #6636 #6639 #6662 #6669 #6738 #6772 #6776 #6780 #6782 #6793 #6806 #6812 #6819 #6826 #6833 #6839 #6850 #6854 #6860 #6878 #6885 #6889 #6937 #6948 #6952 #6960 #6966 #6973 #6981 #7036 #7064 #7072 #7116 #7125 #7130 #7171 #7173 #7180 #7182 #7184 #7192 #7198 #7213 #7234 #7243
Full Changelog: https://github.com/hiyouga/LLaMA-Factory/compare/v0.9.1...v0.9.2
Published by hiyouga about 1 year ago
lazyllm-llamafactory - v0.9.1: Many Vision Models, Qwen2.5 Coder, Gradient Fix
New features
- 🔥 Support Llama-3.2 and Llama-3.2-Vision by @marko1616 in #5547 and #5555
- 🔥 Support LLaVA-NeXT, LLaVA-NeXT-Video and Video-LLaVA by @BUAADreamer in #5574
- 🔥 Support Pixtral model by @Kuangdd01 in #5581
- Support EXAONE3.0 by @shing100 in #5585
- Support Index-series models by @Cuiyn in #5910
- Support Liger-Kernel for Qwen2-VL by @aliencaocao in #5438
- Support download models from ModelHub by @huniu20 in #5642
- Fix abnormal loss values in transformers 4.46 by @hiyouga in #5852 #5871
- Support multi-image inference by @hiyouga in #5895
- Support calculating effective tokens for SFT and DPO by @wtmlon in #6078
Note: install `transformers>=4.46.0,<=4.46.1` to enable the gradient accumulation fix.
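The version window quoted in the note can be checked programmatically. The following is a minimal sketch and not part of LLaMA-Factory: the helper names `parse_version` and `in_fix_window` are hypothetical, and the bounds are copied from the note.

```python
def parse_version(text):
    """Parse '4.46.1' into (4, 46, 1); stops at the first non-numeric piece (e.g. 'dev0')."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    # Pad to three components so that '4.46' compares equal to '4.46.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def in_fix_window(version):
    """Return True if the version lies in the range quoted in the note (4.46.0 to 4.46.1)."""
    return (4, 46, 0) <= parse_version(version) <= (4, 46, 1)


print(in_fix_window("4.46.1"))  # True
print(in_fix_window("4.45.2"))  # False
```

In practice the installed version string could be obtained with `importlib.metadata.version("transformers")` and passed to the check.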
New models
- Base models
- Qwen2.5 (0.5B/1.5B/3B/7B/14B/32B/72B)
- Qwen2.5-Coder (0.5B/1.5B/3B/7B/14B/32B) 🖥️
- Llama-3.2 (1B/3B)
- OpenCoder (1.5B/8B) 🖥️
- Index (1.9B)
- Instruct/Chat models
- Qwen2.5-Instruct (0.5B/1.5B/3B/7B/14B/32B/72B) 🤖
- Qwen2.5-Coder-Instruct (0.5B/1.5B/3B/7B/14B/32B) 🤖🖥️
- Llama-3.2-Instruct (1B/3B) 🤖
- OpenCoder-Instruct (1.5B/8B) 🤖🖥️
- Index-Chat (1.9B) 🤖
- LLaVA-NeXT (7B/8B/13B/34B/72B/110B) 🤖🖼️
- LLaVA-NeXT-Video (7B/34B) 🤖🖼️
- Video-LLaVA (7B) 🤖🖼️
- Pixtral (12B) 🤖🖼️
- EXAONE-3.0-Instruct (8B) 🤖
Security fix
- Fix CVE-2024-52803 by @superboy-zjc in aa6a174d6822340022433c5ba38182b4932adecb
Bug fix
- Update version of rocm docker by @HardAndHeavy in #5427
- Fix Phi-3-small template by @menibrief in #5475
- Fix function call dataset process function by @whybeyoung in #5483
- Add docker args by @StrangeBytesDev in #5533
- Fix logger by @chengchengpei in #5546
- Fix Gemma2 flash attention warning by @amrear in #5580
- Update setup by @johnnynunez in #5615 #5665
- Add project by @NLPJCL in #5801
- Fix saving Qwen2-VL processor by @hiyouga in #5857
- Support change base image in dockerfile by @sd3ntato in #5880
- Fix template replace behaviour by @hiyouga in #5907
- Add `image_dir` argument by @hiyouga in #5909
- Add rank0 logger by @hiyouga in #5912
- Fix DPO metrics by @hiyouga in #5913 #6052
- Update datasets version by @hiyouga in #5926
- Fix chat engines by @hiyouga in #5927
- Fix vllm 0.6.3 by @hiyouga in #5970
- Fix extra args in llamaboard by @hiyouga in #5971
- Fix vllm input args by @JJJJerry in #5973
- Add `vllm_config` args by @hiyouga in #5982 #5990
- Add shm_size in docker compose config by @XYZliang in #6010
- Fix tyro version by @hiyouga in #6065
- Fix ci by @hiyouga in #6120
- Fix Qwen2-VL inference on vLLM by @hiyouga in #6123 #6126
- Release v0.9.1 by @hiyouga in #6124
- Fix #3881 #4712 #5411 #5542 #5549 #5611 #5668 #5705 #5747 #5749 #5768 #5796 #5797 #5883 #5904 #5966 #5988 #6050 #6061
Full Changelog: https://github.com/hiyouga/LLaMA-Factory/compare/v0.9.0...v0.9.1
Published by hiyouga over 1 year ago
lazyllm-llamafactory - v0.9.0: Qwen2-VL, Liger-Kernel, Adam-mini
Congratulations on 30,000 stars! Follow us on X (Twitter).
New features
- 🔥 Support fine-tuning Qwen2-VL model on multi-image datasets by @simonJJJ in #5290
- 🔥 Support time&memory-efficient Liger-Kernel via the `enable_liger_kernel` argument by @hiyouga
- 🔥 Support memory-efficient Adam-mini optimizer via the `use_adam_mini` argument by @relic-yuexi in #5095
- Support fine-tuning Qwen2-VL model on video datasets by @hiyouga in #5365 and @BUAADreamer in #4136 (needs patch https://github.com/huggingface/transformers/pull/33307)
- Support fine-tuning vision language models (VLMs) using RLHF/DPO/ORPO/SimPO approaches by @hiyouga
- Support Unsloth's asynchronous activation offloading method via the `use_unsloth_gc` argument
- Support vLLM 0.6.0 version
- Support MFU calculation by @yzoaim in #5388
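For readers unfamiliar with MFU (model FLOPs utilization), it is the ratio of the FLOPs a training run actually achieves to the hardware's peak FLOPs. The sketch below uses the common approximation of about 6 FLOPs per parameter per trained token (forward plus backward, ignoring attention cost). It is an illustration under that assumption, not the calculation added in #5388; `estimate_mfu` and the example numbers are hypothetical.

```python
def estimate_mfu(num_params, tokens_per_second, peak_flops_per_second, num_devices=1):
    """Approximate MFU as achieved training FLOP/s divided by aggregate peak FLOP/s."""
    # Assumption: ~6 FLOPs per parameter per token (2 forward + 4 backward).
    achieved_flops_per_second = 6.0 * num_params * tokens_per_second
    return achieved_flops_per_second / (peak_flops_per_second * num_devices)


# Hypothetical run: 7B parameters, 2,000 tokens/s on one device with a 312 TFLOP/s peak.
mfu = estimate_mfu(num_params=7e9, tokens_per_second=2000, peak_flops_per_second=312e12)
print(round(mfu, 3))  # 0.269
```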
New models
- Base models
- Qwen2-Math (1.5B/7B/72B) 🔢
- Yi-Coder (1.5B/9B)
- InternLM2.5 (1.8B/7B/20B)
- Gemma-2-2B
- Meta-Llama-3.1 (8B/70B)
- Instruct/Chat models
- MiniCPM/MiniCPM3 (1B/2B/4B) by @LDLINGLINGLING in #4996 #5372 🤖
- Qwen2-Math-Instruct (1.5B/7B/72B) 🤖🔢
- Yi-Coder-Chat (1.5B/9B) 🤖
- InternLM2.5-Chat (1.8B/7B/20B) 🤖
- Qwen2-VL-Instruct (2B/7B) 🤖🖼️
- Gemma-2-2B-it by @codemayq in #5037 🤖
- Meta-Llama-3.1-Instruct (8B/70B) 🤖
- Mistral-Nemo-Instruct (12B) 🤖
New datasets
- Supervised fine-tuning datasets
- Magpie-ultra-v0.1 (en)
- Pokemon-gpt4o-captions (en&zh) 🖼️
- Preference datasets
- RLHF-V (en) 🖼️
- VLFeedback (en) 🖼️
Changes
- For compatibility reasons, fine-tuning vision language models (VLMs) requires `transformers>=4.35.0.dev0`; install it with `pip install git+https://github.com/huggingface/transformers.git`.
- `visual_inputs` has been deprecated; you no longer need to specify this argument.
- LlamaFactory now adopts lazy loading for multimodal inputs, see #5346 for details. Please use `preprocessing_batch_size` to restrict the batch size in dataset pre-processing (supported by @naem1023 in #5323).
- LlamaFactory now supports `lmf` (equivalent to `llamafactory-cli`) as a shortcut command.
Bug fix
- Fix LlamaBoard export by @liuwwang in #4950
- Add ROCm dockerfiles by @HardAndHeavy in #4970
- Fix deepseek template by @piamo in #4892
- Fix pissa savecallback by @codemayq in #4995
- Add Korean display language in LlamaBoard by @Eruly in #5010
- Fix deepseekcoder template by @relic-yuexi in #5072
- Fix examples by @codemayq in #5109
- Fix `mask_history` truncate from last by @YeQiuO in #5115
- Fix jinja template by @YeQiuO in #5156
- Fix PPO optimizer and lr scheduler by @liu-zichen in #5163
- Add SailorLLM template by @chenhuiyu in #5185
- Fix XPU device count by @Zxilly in #5188
- Fix bf16 check in NPU by @Ricardo-L-C in #5193
- Update NPU docker image by @MengqingCao in #5230
- Fix image input api by @marko1616 in #5237
- Add liger-kernel link by @ByronHsu in #5317
- Fix #4684 #4696 #4917 #4925 #4928 #4944 #4959 #4992 #5035 #5048 #5060 #5092 #5228 #5252 #5292 #5295 #5305 #5307 #5308 #5324 #5331 #5334 #5338 #5344 #5366 #5384
Published by hiyouga over 1 year ago
lazyllm-llamafactory - v0.8.3: Neat Packing, Split Evaluation
New features
- 🔥 Support contamination-free packing via the `neat_packing` argument by @chuan298 in #4224
- 🔥 Support split evaluation via the `eval_dataset` argument by @codemayq in #4691
- 🔥 Support HQQ/EETQ quantization via the `quantization_method` argument by @hiyouga
- 🔥 Support ZeRO-3 when using BAdam by @Ledzy in #4352
- Support training on the last turn only via the `mask_history` argument by @aofengdaxia in #4878
- Add NPU Dockerfile by @MengqingCao in #4355
- Support building FlashAttention2 in Dockerfile by @hzhaoy in #4461
- Support `batch_eval_metrics` at evaluation by @hiyouga
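One common way to make sequence packing contamination-free is to restrict attention so that each token only attends to earlier tokens of its own original sequence. The sketch below builds such a block-diagonal causal mask in plain Python. It illustrates the idea under that assumption and is not the code from #4224; the function name `packed_causal_mask` is hypothetical.

```python
def packed_causal_mask(seq_lengths):
    """Return a boolean matrix where mask[q][k] is True if query q may attend to key k.

    Several sequences are packed back to back; attention is causal and never
    crosses the boundary between two original sequences.
    """
    # Label every packed position with the index of the sequence it came from.
    segment_ids = []
    for segment, length in enumerate(seq_lengths):
        segment_ids.extend([segment] * length)

    total = len(segment_ids)
    return [
        [segment_ids[q] == segment_ids[k] and k <= q for k in range(total)]
        for q in range(total)
    ]


# Two sequences of lengths 2 and 3 packed into one row of 5 tokens.
for row in packed_causal_mask([2, 3]):
    print("".join("x" if allowed else "." for allowed in row))
```

Each printed row marks with `x` the positions a token may attend to; the second sequence never sees the first.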
New models
- Base models
- InternLM2.5-7B
- Gemma2 (9B/27B)
- Instruct/Chat models
- TeleChat-1B-Chat by @hzhaoy in #4651 🤖
- InternLM2.5-7B-Chat 🤖
- CodeGeeX4-9B-Chat 🤖
- Gemma2-it (9B/27B) 🤖
Changes
- Fix DPO cutoff len and deprecate `reserved_label_len` argument
- Improve loss function for reward modeling
Bug fix
- Fix numpy version by @MengqingCao in #4382
- Improve cli by @kno10 in #4409
- Add `tool_format` parameter to control prompt by @mMrBun in #4417
- Automatically label npu issue by @MengqingCao in #4445
- Fix flash_attn args by @stceum in #4446
- Fix docker-compose path by @MengqingCao in #4544
- Fix torch-npu dependency by @hashstone in #4561
- Fix deepspeed + pissa by @hzhaoy in #4580
- Improve cli by @injet-zhou in #4590
- Add project by @wzh1994 in #4662
- Fix docstring by @hzhaoy in #4673
- Fix Windows command preview in WebUI by @marko1616 in #4700
- Fix vllm 0.5.1 by @T-Atlas in #4706
- Fix save value head model callback by @yzoaim in #4746
- Fix CUDA Dockerfile by @hzhaoy in #4781
- Fix examples by @codemayq in #4804
- Fix evaluation data split by @codemayq in #4821
- Fix CI by @codemayq in #4822
- Fix #2290 #3974 #4113 #4379 #4398 #4402 #4410 #4419 #4432 #4456 #4458 #4549 #4556 #4579 #4592 #4609 #4617 #4674 #4677 #4683 #4684 #4699 #4705 #4731 #4742 #4779 #4780 #4786 #4792 #4820 #4826
Published by hiyouga almost 2 years ago
lazyllm-llamafactory - v0.8.2: PiSSA, Parallel Functions
New features
- Support GLM-4 tools and parallel function calling by @mMrBun in #4173
- Support PiSSA fine-tuning by @hiyouga in #4307
New models
- Base models
- DeepSeek-Coder-V2 (16B MoE/236B MoE)
- Instruct/Chat models
- MiniCPM-2B 🤖
- DeepSeek-Coder-V2-Instruct (16B MoE/236B MoE) 🤖
New datasets
- Supervised fine-tuning datasets
- Neo-sft (zh)
- Magpie-Pro-300K-Filtered (en) by @EliMCosta in #4309
- WebInstruct (en) by @EliMCosta in #4309
Bug fix
- Fix DPO+ZeRO3 problem by @hiyouga
- Add MANIFEST.in by @iamthebot in #4191
- Fix eos_token in llama3 pretrain by @dignfei in #4204
- Fix vllm version by @kimdwkimdw and @hzhaoy in #4234 and #4246
- Fix Dockerfile by @EliMCosta in #4314
- Fix pandas version by @zzxzz12345 in #4334
- Fix #3162 #3196 #3778 #4198 #4209 #4221 #4227 #4238 #4242 #4271 #4292 #4295 #4326 #4346 #4357 #4362
Published by hiyouga almost 2 years ago
lazyllm-llamafactory - v0.8.1: Patch release
- Fix #2666: Unsloth+DoRA
- Fix #4145: The PyTorch version of the docker image does not match the vLLM requirement
- Fix #4160: The problem in LongLoRA implementation with the help of @f-q23
- Fix #4167: The installation problem in the Windows system by @yzoaim
Published by hiyouga almost 2 years ago
lazyllm-llamafactory - v0.8.0: GLM-4, Qwen2, PaliGemma, KTO, SimPO
Stronger LlamaBoard 💪
- Support single-node distributed training in Web UI
- Add dropdown menu for easily resuming from checkpoints and picking saved configurations by @hiyouga and @hzhaoy in #4053
- Support selecting checkpoints of full/freeze tuning
- Add throughput metrics to LlamaBoard by @injet-zhou in #4066
- Faster UI loading
New features
- Add KTO algorithm by @enji-zhou in #3785
- Add SimPO algorithm by @hiyouga
- Support passing `max_lora_rank` to the vLLM backend by @jue-jue-zi in #3794
- Support preference datasets in sharegpt format and remove big files from git repo by @hiyouga in #3799
- Support setting system messages in CLI inference by @ycjcl868 in #3812
- Add `num_samples` option in `dataset_info.json` by @seanzhang-zhichen in #3829
- Add NPU docker image by @dongdongqiang2018 in #3876
- Improve NPU document by @MengqingCao in #3930
- Support SFT packing with greedy knapsack algorithm by @AlongWY in #4009
- Add `llamafactory-cli env` for bug report
- Support image input in the API mode
- Support random initialization via the `train_from_scratch` argument
- Initialize CI
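To give a feel for the greedy packing mentioned in the list above, here is a minimal sketch that packs samples into fixed-capacity sequences with a first-fit-decreasing heuristic. This is a simple greedy stand-in chosen for illustration, not the implementation from #4009; the function name `greedy_pack` and the example lengths are hypothetical.

```python
def greedy_pack(lengths, capacity):
    """Group sample indices into bins whose total length does not exceed capacity.

    Samples are visited from longest to shortest and placed into the first
    bin that still has room; a new bin is opened when none fits.
    """
    bins = []  # each entry is [remaining_capacity, [sample indices]]
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    for index in order:
        size = lengths[index]
        if size > capacity:
            raise ValueError(f"sample {index} (length {size}) exceeds capacity {capacity}")
        for entry in bins:
            if entry[0] >= size:
                entry[0] -= size
                entry[1].append(index)
                break
        else:
            # No existing bin had room for this sample.
            bins.append([capacity - size, [index]])
    return [entry[1] for entry in bins]


# Five samples packed into sequences of at most 6 tokens.
print(greedy_pack([5, 3, 2, 4, 1], capacity=6))  # [[0, 4], [3, 2], [1]]
```

Sorting by length first tends to leave less unused space per packed sequence than packing in arrival order, at the cost of changing the sample order.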
New models
- Base models
- Qwen2 (0.5B/1.5B/7B/72B/MoE)
- PaliGemma-3B (pt/mix) 🖼️
- GLM-4-9B
- Falcon-11B
- DeepSeek-V2-Lite (16B)
- Instruct/Chat models
- Qwen2-Instruct (0.5B/1.5B/7B/72B/MoE) 🤖
- Mistral-7B-Instruct-v0.3 🤖
- Phi-3-small-8k-instruct (7B) 🤖
- Aya-23 (8B/35B) 🤖
- OpenChat-3.6-8B 🤖
- GLM-4-9B-Chat 🤖
- TeleChat-12B-Chat by @hzhaoy in #3958 🤖
- Phi-3-medium-8k-instruct (14B) 🤖
- DeepSeek-V2-Lite-Chat (16B) 🤖
- Codestral-22B-v0.1 🤖
New datasets
- Pre-training datasets
- FineWeb (en)
- FineWeb-Edu (en)
- Supervised fine-tuning datasets
- Ruozhiba-GPT4 (zh)
- STEM-Instruction (zh)
- Preference datasets
- Argilla-KTO-mix-15K (en)
- UltraFeedback (en)
Bug fix
- Fix RLHF for multimodal finetuning
- Fix LoRA target in multimodal finetuning by @BUAADreamer in #3835
- Fix `yi` template by @Yimi81 in #3925
- Fix abort issue in LlamaBoard by @injet-zhou in #3987
- Pass `scheduler_specific_kwargs` to `get_scheduler` by @Uminosachi in #4006
- Fix hyperparameters helps by @xu-song in #4007
- Update issue template by @statelesshz in #4011
- Fix vllm dtype parameter
- Fix exporting hyperparameters by @MengqingCao in #4080
- Fix DeepSpeed ZeRO3 in PPO trainer
- Fix #3108 #3387 #3646 #3717 #3764 #3769 #3803 #3807 #3818 #3837 #3847 #3853 #3873 #3900 #3931 #3965 #3971 #3978 #3992 #4005 #4012 #4013 #4022 #4033 #4043 #4061 #4075 #4077 #4079 #4085 #4090 #4120 #4132 #4137 #4139
Published by hiyouga almost 2 years ago
lazyllm-llamafactory - v0.7.1: Ascend NPU Support, Yi-VL Models
🚨🚨 Core refactor 🚨🚨
- Add CLI usage: we now recommend using `llamafactory-cli` to launch training and inference; the entry point is located in `cli.py`
- Rename files: `train_bash.py` -> `train.py`, `train_web.py` -> `webui.py`, `api_demo.py` -> `api.py`
- Remove files: `cli_demo.py`, `evaluate.py`, `export_model.py`, `web_demo.py`; use `llamafactory-cli chat/eval/export/webchat` instead
- Use YAML configs in examples instead of shell scripts for better readability
- Remove the sha1 hash check when loading datasets
- Rename arguments: `num_layer_trainable` -> `freeze_trainable_layers`, `name_module_trainable` -> `freeze_trainable_modules`
The above changes were made by @hiyouga in #3596
REMINDER: Installation is now mandatory to use LLaMA Factory.
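The file changes in this refactor can be summarized as a small lookup table, which may help when updating old launch scripts. This is a sketch only: the helper `migration_hint` is hypothetical, and the pairing of each removed script with a `llamafactory-cli` subcommand is an assumption based on the order in which the two lists appear in these notes.

```python
# Files that were renamed in the refactor (old name -> new name).
RENAMED = {
    "train_bash.py": "train.py",
    "train_web.py": "webui.py",
    "api_demo.py": "api.py",
}

# Files that were removed, paired with the CLI subcommand assumed to replace them.
REMOVED = {
    "cli_demo.py": "chat",
    "evaluate.py": "eval",
    "export_model.py": "export",
    "web_demo.py": "webchat",
}


def migration_hint(script):
    """Return a short hint describing what to use instead of a pre-refactor script."""
    if script in RENAMED:
        return f"renamed to {RENAMED[script]}"
    if script in REMOVED:
        return f"removed; use `llamafactory-cli {REMOVED[script]}`"
    return "not affected by the refactor"


print(migration_hint("train_bash.py"))  # renamed to train.py
print(migration_hint("cli_demo.py"))    # removed; use `llamafactory-cli chat`
```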
New features
- Support training and inference on the Ascend NPU 910 devices by @zhou-wjjw and @statelesshz (docker images are also provided)
- Support `stop` parameter in vLLM engine by @zhaonx in #3527
- Support fine-tuning token embeddings in freeze tuning via the `freeze_extra_modules` argument
- Add Llama3 quickstart to readme
New models
- Base models
- Yi-1.5 (6B/9B/34B)
- DeepSeek-V2 (236B)
- Instruct/Chat models
- Yi-1.5-Chat (6B/9B/34B) 🤖
- Yi-VL-Chat (6B/34B) by @BUAADreamer in #3748 🖼️🤖
- Llama3-Chinese-Chat (8B/70B) 🤖
- DeepSeek-V2-Chat (236B) 🤖
Bug fix
- Add badam arguments to LlamaBoard by @codemayq in #3487
- Add openai data format to readme by @khazic in #3490
- Fix slow operation in dpo/orpo trainer by @hiyouga
- Fix badam examples by @pha123661 in #3578
- Fix download link of the nectar_rm dataset by @ZeyuTeng96 in #3588
- Add project by @Katehuuh in #3601
- Fix dockerfile by @gaussian8 in #3604
- Fix full tuning of MLLMs by @BUAADreamer in #3651
- Fix gradio environment variables by @cocktailpeanut in #3654
- Fix typo and add log in API by @Tendo33 in #3655
- Fix download link of the phi-3 model by @YUUUCC in #3683
- Fix #3559 #3560 #3602 #3603 #3606 #3625 #3650 #3658 #3674 #3694 #3702 #3724 #3728
Published by hiyouga about 2 years ago
lazyllm-llamafactory - v0.7.0: LLaVA Multimodal LLM Support
Congratulations on 20k stars! We ranked 1st on GitHub Trending on Apr. 23rd 🔥 Follow us on X.
New features
- Support SFT/PPO/DPO/ORPO for the LLaVA-1.5 model by @BUAADreamer in #3450
- Support inferring the LLaVA-1.5 model with both native Transformers and vLLM by @hiyouga in #3454
- Support vLLM+LoRA inference for some models (see support list)
- Support 2x faster generation of the QLoRA model based on UnslothAI's optimization
- Support adding new special tokens to the tokenizer via the `new_special_tokens` argument
- Support choosing the device to merge LoRA in LlamaBoard via the `export_device` argument
- Add a Colab notebook for getting started with fine-tuning the Llama-3 model on a free T4 GPU
- Automatically enable SDPA attention and fast tokenizer for higher performance
New models
- Base models
- OLMo-1.7-7B
- Jamba-v0.1-51B
- Qwen1.5-110B
- DBRX-132B-Base
- Instruct/Chat models
- Phi-3-mini-3.8B-instruct (4k/128k)
- LLaVA-1.5-7B
- LLaVA-1.5-13B
- Qwen1.5-110B-Chat
- DBRX-132B-Instruct
New datasets
- Supervised fine-tuning datasets
- LLaVA mixed (en&zh) by @BUAADreamer in #3471
- Preference datasets
- DPO mixed (en&zh) by @hiyouga
Bug fix
- Fix #2093 #3333 #3347 #3374 #3387
Published by hiyouga about 2 years ago
lazyllm-llamafactory - v0.6.3: Llama-3 and 3x Longer QLoRA
New features
- Support Meta Llama-3 (8B/70B) models
- Support UnslothAI's long-context QLoRA optimization (56,000 context length for Llama-2 7B in 24GB)
- Support previewing local datasets in directories in LlamaBoard by @codemayq in #3291
New algorithms
- Support BAdam algorithm by @Ledzy in #3287
- Support Mixture-of-Depths training by @mlinmg in #3338
New models
- Base models
- CodeGemma (2B/7B)
- CodeQwen1.5-7B
- Llama-3 (8B/70B)
- Mixtral-8x22B-v0.1
- Instruct/Chat models
- CodeGemma-7B-it
- CodeQwen1.5-7B-Chat
- Llama-3-Instruct (8B/70B)
- Command R (35B) by @marko1616 in #3254
- Command R+ (104B) by @marko1616 in #3254
- Mixtral-8x22B-Instruct-v0.1
Bug fix
- Fix full-tuning batch prediction examples by @khazic in #3261
- Fix `output_router_logits` of Mixtral by @liu-zichen in #3276
- Fix automodel from pretrained with attn implementation (see https://github.com/huggingface/transformers/issues/30298)
- Fix the failure-to-converge issue in the layerwise galore optimizer (see https://github.com/huggingface/transformers/issues/30371)
- Fix #3184 #3238 #3247 #3273 #3316 #3317 #3324 #3348 #3352 #3365 #3366
Published by hiyouga about 2 years ago
lazyllm-llamafactory - v0.6.2: ORPO and Qwen1.5-32B
New features
- Support ORPO algorithm by @hiyouga in #3066
- Support inferring BNB 4-bit models on multiple GPUs via the `quantization_device_map` argument
- Reorganize README files, move example scripts to the `examples` folder
- Support saving & loading arguments quickly in LlamaBoard by @hiyouga and @marko1616 in #3046
- Support loading alpaca-format datasets from the hub without `dataset_info.json` by specifying `--dataset_dir ONLINE`
- Add a parameter `moe_aux_loss_coef` to control the coefficient of auxiliary loss in MoE models.
New models
- Base models
- Breeze-7B-Base
- Qwen1.5-MoE-A2.7B (14B)
- Qwen1.5-32B
- Instruct/Chat models
- Breeze-7B-Instruct
- Qwen1.5-MoE-A2.7B-Chat (14B)
- Qwen1.5-32B-Chat
Bug fix
- Fix pile dataset download config by @lealaxy in #3053
- Fix model generation config by @marko1616 in #3057
- Fix qwen1.5 models DPO training by @changingivan and @hiyouga in #3083
- Support Qwen1.5-32B by @sliderSun in #3160
- Support Breeze-7B by @codemayq in #3161
- Fix `addtional_target` in unsloth by @kno10 in #3201
- Fix #2807 #3022 #3023 #3046 #3077 #3085 #3116 #3200 #3225
Published by hiyouga about 2 years ago
lazyllm-llamafactory - v0.6.1: Patch release
This patch mainly fixes #2983
In commit 9bec3c98a22c91b1c28fda757db51eb780291641, we built the optimizer and scheduler inside the trainers, which inadvertently introduced a bug: when DeepSpeed was enabled, the trainers in transformers would build an optimizer and scheduler before calling the `create_optimizer_and_scheduler` method [1]; the optimizer created by our method would then overwrite the original one, while the scheduler would not be replaced. Consequently, the scheduler no longer affected the learning rate of the optimizer actually in use, leading to a regression in the training results. We fixed this bug in 3bcd41b639899e72bcabc51d59bac8967af19899 and 8c77b1091296e204dc3c8c1f157c288ca5b236bd. Thanks to @HideLord for helping us identify this critical bug.
[1] https://github.com/huggingface/transformers/blob/v4.39.1/src/transformers/trainer.py#L1877-L1881
We have also fixed #2961 #2981 #2982 #2983 #2991 #3010
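The failure mode described above can be reproduced with toy objects. The sketch below is only an illustration under stated assumptions: `ToyOptimizer`, `ToyLinearDecay`, and `simulate` are hypothetical stand-ins, not classes from transformers or DeepSpeed, and rebinding the scheduler is just one way to keep the pair in sync. The actual fix is in the commits referenced above.

```python
class ToyOptimizer:
    """Stand-in for an optimizer: it only stores a learning rate."""

    def __init__(self, lr: float) -> None:
        self.lr = lr


class ToyLinearDecay:
    """Stand-in for a scheduler: it mutates the optimizer it was built with."""

    def __init__(self, optimizer: ToyOptimizer, total_steps: int) -> None:
        self.optimizer = optimizer
        self.base_lr = optimizer.lr
        self.total_steps = total_steps
        self.steps_done = 0

    def step(self) -> None:
        self.steps_done += 1
        remaining = max(0, self.total_steps - self.steps_done)
        self.optimizer.lr = self.base_lr * remaining / self.total_steps


def simulate(rebuild_scheduler: bool, steps: int = 5) -> float:
    """Return the learning rate of the optimizer that is actually used."""
    original = ToyOptimizer(lr=1e-3)
    scheduler = ToyLinearDecay(original, total_steps=10)

    # A second optimizer replaces the first one as the active optimizer.
    active = ToyOptimizer(lr=1e-3)
    if rebuild_scheduler:
        # The schedule only takes effect if it is bound to the new optimizer.
        scheduler = ToyLinearDecay(active, total_steps=10)

    for _ in range(steps):
        scheduler.step()
    return active.lr


stale_lr = simulate(rebuild_scheduler=False)
synced_lr = simulate(rebuild_scheduler=True)
print(stale_lr, synced_lr)
```

In the stale case the scheduler keeps decaying the learning rate of an optimizer that is no longer used, so the active optimizer trains at a constant rate.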
- Python
Published by hiyouga about 2 years ago
lazyllm-llamafactory - v0.6.0: Paper Release, GaLore and FSDP+QLoRA
We released our paper on arXiv! Thanks to all co-authors and to AK for the recommendation
New features
- Support GaLore algorithm, allowing full-parameter learning of a 7B model using less than 24GB VRAM
- Support FSDP+QLoRA that allows QLoRA fine-tuning of a 70B model on 2x24GB GPUs
- Support LoRA+ algorithm for better LoRA fine-tuning by @qibaoyuan in #2830
- LLaMA Factory 🤝 vLLM, enjoy 270% inference speed with `--infer_backend vllm`
- Add Colab notebook for easily getting started
- Support pushing fine-tuned models to Hugging Face Hub in web UI
- Support `apply_chat_template` by adding a chat template to the tokenizer after fine-tuning
- Add dockerize support by @S3Studio in #2743 #2849
New models
- Base models
- OLMo (1B/7B)
- StarCoder2 (3B/7B/15B)
- Yi-9B
- Instruct/Chat models
- OLMo-7B-Instruct
New datasets
- Supervised fine-tuning datasets
- Cosmopedia (en)
- Preference datasets
- Orca DPO (en)
Bug fix
- Fix flash_attn in web UI by @cx2333-gt in #2730
- Fix deepspeed runtime error in PPO by @stephen-nju in #2746
- Fix readme ddp instruction by @khazic in #2903
- Fix environment variable in datasets by @SirlyDreamer in #2905
- Fix readme information by @0xez in #2919
- Fix generation config validation by @marko1616 in #2945
- Fix requirements by @rkinas in #2963
- Fix bitsandbytes windows version by @Tsumugii24 in #2967
- Fix #2346 #2642 #2649 #2732 #2735 #2756 #2766 #2775 #2777 #2782 #2798 #2802 #2803 #2817 #2895 #2928 #2936 #2941
- Python
Published by hiyouga about 2 years ago
lazyllm-llamafactory - v0.5.3: DoRA and AWQ/AQLM QLoRA
New features
- Support DoRA (Weight-Decomposed LoRA)
- Support QLoRA for the AWQ/AQLM quantized models, now 2-bit QLoRA is feasible
- Provide some example scripts in https://github.com/hiyouga/LLaMA-Factory/tree/main/examples
New models
- Base models
- Gemma (2B/7B)
- Instruct/Chat models
- Gemma-it (2B/7B)
Bug fix
- Add flash-attn package for Windows user by @codemayq in #2514
- Fix ppo trainer #1163 by @stephen-nju in #2525
- Support atom models by @Rayrtfr in #2531
- Support role in webui by @lungothrin in #2575
- Bump accelerate to 0.27.2 and fix #2552 by @Katehuuh in #2608
- Fix #2512 #2516 #2532 #2533 #2629
- Python
Published by hiyouga over 2 years ago
lazyllm-llamafactory - v0.5.2: Block expansion, Qwen1.5 models
New features
- Support block expansion in LLaMA Pro, see `tests/llama_pro.py` for usage
- Add `use_rslora` option for the LoRA method
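As background for the `use_rslora` option: rank-stabilized LoRA changes only the factor that scales the adapter update, from alpha / r to alpha / sqrt(r). A minimal sketch follows, where `lora_scaling` is a hypothetical helper written for this note and not a function of the project.

```python
import math


def lora_scaling(lora_alpha: float, rank: int, use_rslora: bool = False) -> float:
    """Return the factor applied to the low-rank update B @ A."""
    if rank <= 0:
        raise ValueError("rank must be positive")
    if use_rslora:
        # Rank-stabilized LoRA: the scale decays with sqrt(rank).
        return lora_alpha / math.sqrt(rank)
    # Standard LoRA: the scale decays linearly with rank.
    return lora_alpha / rank


standard = lora_scaling(16, 64)
stabilized = lora_scaling(16, 64, use_rslora=True)
print(standard, stabilized)
```

With the standard factor, the update shrinks quickly as the rank grows. The square-root factor keeps its magnitude more stable at high ranks.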
New models
- Base models
- Qwen1.5 (0.5B/1.8B/4B/7B/14B/72B)
- DeepSeekMath-7B-Base
- DeepSeekCoder-7B-Base-v1.5
- Orion-14B-Base
- Instruct/Chat models
- Qwen1.5-Chat (0.5B/1.8B/4B/7B/14B/72B)
- MiniCPM-2B-SFT/DPO
- DeepSeekMath-7B-Instruct
- DeepSeekCoder-7B-Instruct-v1.5
- Orion-14B-Chat
- Orion-14B-Long-Chat
- Orion-14B-RAG-Chat
- Orion-14B-Plugin-Chat
New datasets
- Supervised fine-tuning datasets
- SlimOrca (en)
- Dolly (de)
- Dolphin (de)
- Airoboros (de)
- Preference datasets
- Orca DPO (de)
Bug fix
- Fix `torch_dtype` check in export model by @fenglui in #2262
- Add Russian locale to LLaMA Board by @seoeaa in #2264
- Remove manually set `use_cache` in export model by @yhyu13 in #2266
- Fix DeepSpeed Zero3 training with MoE models by @A-Cepheus in #2283
- Add a patch for full training of the Mixtral model using DeepSpeed Zero3 by @ftgreat in #2319
- Fix bug in data pre-processing by @lxsyz in #2411
- Add German sft and dpo datasets by @johannhartmann in #2423
- Add version checking in `test_toolcall.py` by @mini-tiger in #2435
- Enable parsing of SlimOrca dataset by @mnmueller in #2462
- Add tags for models when pushing to hf hub by @younesbelkada in #2474
- Fix #2189 #2268 #2282 #2320 #2338 #2376 #2388 #2394 #2397 #2404 #2412 #2420 #2421 #2436 #2438 #2471 #2481
- Python
Published by hiyouga over 2 years ago
lazyllm-llamafactory - v0.5.0: Agent Tuning, Unsloth Integration
Congratulations on 10k stars! Make LLM fine-tuning easier and faster together with LLaMA-Factory ✨
New features
- Support agent tuning for most models; you can fine-tune any LLM with `--dataset glaive_toolcall` for tool use #2226
- Support function calling in both API and Web mode with fine-tuned models, using the same format as OpenAI's
- LLaMA Factory 🤝 Unsloth, enjoy 170% LoRA training speed with `--use_unsloth`, see benchmarking here
- Support fine-tuning models on MPS device #2090
New models
- Base models
- Phi-2 (2.7B)
- InternLM2 (7B/20B)
- SOLAR-10.7B
- DeepseekMoE-16B-Base
- XVERSE-65B-2
- Instruct/Chat models
- InternLM2-Chat (7B/20B)
- SOLAR-10.7B-Instruct
- DeepseekMoE-16B-Chat
- Yuan (2B/51B/102B)
New datasets
- Supervised fine-tuning datasets
- deepctrl dataset
- Glaive function calling dataset v2
Core updates
- Refactor data engine: clearer dataset alignment, easier templating and tool formatting
- Refactor saving logic for models with value head #1789
- Use ruff code formatter for stylish code
Bug fix
- Bump transformers version to 4.36.2 by @ShaneTian in #1932
- Fix requirements by @dasdristanta13 in #2117
- Add Machine-Mindset project by @JessyTsu1 in #2163
- Fix typo in readme file by @junuMoon in #2194
- Support resize token embeddings with ZeRO3 by @liu-zichen in #2201
- Fix #1073 #1462 #1617 #1735 #1742 #1789 #1821 #1875 #1895 #1900 #1908 #1907 #1909 #1923 #2014 #2067 #2081 #2090 #2098 #2125 #2127 #2147 #2161 #2164 #2183 #2195 #2249 #2260
- Python
Published by hiyouga over 2 years ago
lazyllm-llamafactory - v0.4.0: Mixtral-8x7B, DPO-ftx, AutoGPTQ integration
🚨🚨 Core refactor
- Deprecate `checkpoint_dir` and use `adapter_name_or_path` instead
- Replace `resume_lora_training` with `create_new_adapter`
- Move the patches in model loading to `llmtuner.model.patcher`
- Bump to Transformers 4.36.1 to adapt to the Mixtral models
- Wide adaptation for FlashAttention2 (LLaMA, Falcon, Mistral)
- Temporarily disable LongLoRA due to breaking changes, which will be supported later
The above changes were made by @hiyouga in #1864
New features
- Add DPO-ftx: mixing fine-tuning gradients to DPO via the `dpo_ftx` argument, suggested by @lylcst in https://github.com/hiyouga/LLaMA-Factory/issues/1347#issuecomment-1846943606
- Integrate AutoGPTQ into the model export via the `export_quantization_bit` and `export_quantization_dataset` arguments
- Support loading datasets from ModelScope Hub by @tastelikefeet and @wangxingjun778 in #1802
- Support resizing token embeddings with the noisy mean initialization by @hiyouga in a66186b8724ffd0351a32593ab52d8a2312f339b
- Support system column in both alpaca and sharegpt dataset formats
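The DPO-ftx item above mixes a supervised fine-tuning term into the DPO objective. A rough sketch of that idea in plain Python is shown below. The function name, the inputs, and the length normalization of the SFT term are assumptions for illustration and may differ from what the library does internally.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_ftx_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    chosen_length: int,
    beta: float = 0.1,
    dpo_ftx: float = 0.0,
) -> float:
    """DPO loss plus dpo_ftx times an SFT loss on the chosen response.

    The log-probabilities are sums over the response tokens.
    """
    margin = (policy_chosen_logp - ref_chosen_logp) - (policy_rejected_logp - ref_rejected_logp)
    dpo_loss = -log_sigmoid(beta * margin)
    # Assumption: the SFT term is the per-token negative log-likelihood of the chosen response.
    sft_loss = -policy_chosen_logp / chosen_length
    return dpo_loss + dpo_ftx * sft_loss


pure_dpo = dpo_ftx_loss(-10.0, -10.0, -10.0, -10.0, chosen_length=5)
mixed = dpo_ftx_loss(-10.0, -10.0, -10.0, -10.0, chosen_length=5, dpo_ftx=1.0)
print(pure_dpo, mixed)
```

Setting `dpo_ftx` to 0 recovers plain DPO. Larger values pull the policy toward the chosen responses, as supervised fine-tuning would.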
New models
- Base models
- Mixtral-8x7B-v0.1
- Instruct/Chat models
- Mixtral-8x7B-v0.1-instruct
- Mistral-7B-Instruct-v0.2
- XVERSE-65B-Chat
- Yi-6B-Chat
Bug fix
- Improve logging for unknown arguments by @yhyu13 in #1868
- Fix an overflow issue in LLaMA2 PPO training #1742
- Fix #246 #1561 #1715 #1764 #1765 #1770 #1771 #1784 #1786 #1795 #1815 #1819 #1831
- Python
Published by hiyouga over 2 years ago
lazyllm-llamafactory - v0.3.3: ModelScope integration, reward server
New features
- Support loading pre-trained models from ModelScope Hub by @tastelikefeet in #1700
- Support launching a reward model server in demo API via specifying `--stage=rm` in `api_demo.py`
- Support using a reward model server in PPO training via specifying `--reward_model_type api`
- Support adjusting the shard size of exported models via the `export_size` argument
New models
- Base models
- DeepseekLLM-Base (7B/67B)
- Qwen (1.8B/72B)
- Instruct/Chat models
- DeepseekLLM-Chat (7B/67B)
- Qwen-Chat (1.8B/72B)
- Yi-34B-Chat
New datasets
- Supervised fine-tuning datasets
- Nectar dataset by @mlinmg in #1689
- Preference datasets
- Nectar dataset by @mlinmg in #1689
Bug fix
- Improve `get_current_device` by @billvsme in #1690
- Improve web UI preview by @Samge0 in #1695
- Fix #1543 #1597 #1657 #1658 #1659 #1668 #1682 #1696 #1699 #1703 #1707 #1710
- Python
Published by hiyouga over 2 years ago
lazyllm-llamafactory - v0.3.2: Patch release
New features
- Support training GPTQ quantized model #729 #1481 #1545
- Support resuming reward model training #1567
Bug fix
- Change default PPO parameters by @hannlp in #1553
- Fix ChatGLM2&3 templates #1453 #1480
- Fix #1548 by @Outsider565 in #1544
- Fix #1263 #1550 #1558
- Python
Published by hiyouga over 2 years ago
lazyllm-llamafactory - v0.3.0: Full-parameter RLHF
New features
- Support full-parameter RLHF training (RM & PPO)
- Refactor llmtuner core in #1525 by @hiyouga
- Better LLaMA Board: full-parameter RLHF and demo mode
New models
- Base models
- ChineseLLaMA-1.3B
- LingoWhale-8B
- Instruct/Chat models
- ChineseAlpaca-1.3B
- Zephyr-7B-Alpha/Beta
Bug fix
- Fix bugs in partial-parameter (freeze) tuning
- Fix #224 #336 #931 #936 #1011 #1489 #1494 #1507 #1514
- Python
Published by hiyouga over 2 years ago
lazyllm-llamafactory - v0.2.2: Patch release
Bug fix
- Fix the OOM issue in PPO training by @mmbwf in #424
- Fix fine-tuning arguments by @yyq in #1454
- Refactor constants and evaluation by @hiyouga
- Fix #1452 #1466 #1478
- Python
Published by hiyouga over 2 years ago
lazyllm-llamafactory - v0.2.1: Variant models, NEFTune trick
New features
- Support NEFTune trick for supervised fine-tuning by @anvie in #1252
- Support loading dataset in the sharegpt format - read data/readme for details
- Support generating multiple responses in demo API via the `n` parameter
- Support caching the pre-processed dataset files via the `cache_path` argument
- Better LLaMA Board (pagination, controls, etc.)
- Support `push_to_hub` argument #1088
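The NEFTune trick listed above adds uniform noise to the token embeddings during training, scaled by alpha / sqrt(L * d) for sequence length L and embedding dimension d. The sketch below uses nested Python lists instead of tensors. `neftune_noise` and its parameters are hypothetical names chosen for the example.

```python
import math
import random


def neftune_noise(embeddings, alpha=5.0, rng=None):
    """Return a noisy copy of a [seq_len][dim] embedding matrix.

    Each entry receives noise drawn from Uniform(-1, 1) * alpha / sqrt(seq_len * dim).
    """
    rng = rng or random.Random()
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    scale = alpha / math.sqrt(seq_len * dim)
    return [[value + rng.uniform(-1.0, 1.0) * scale for value in row] for row in embeddings]


# Hypothetical 4-token sequence with 8-dimensional embeddings, all zeros for clarity.
clean = [[0.0] * 8 for _ in range(4)]
noisy = neftune_noise(clean, alpha=5.0, rng=random.Random(0))
max_abs = max(abs(value) for row in noisy for value in row)
print(max_abs)
```

The noise is applied only while training. At inference time the embeddings are used unchanged.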
New models
- Base models
- ChatGLM3-6B-Base
- Yi (6B/34B)
- Mistral-7B
- BlueLM-7B-Base
- Skywork-13B-Base
- XVERSE-65B
- Falcon-180B
- Deepseek-Coder-Base (1.3B/6.7B/33B)
- Instruct/Chat models
- ChatGLM3-6B
- Mistral-7B-Instruct
- BlueLM-7B-Chat
- Zephyr-7B
- OpenChat-3.5
- Yayi (7B/13B)
- Deepseek-Coder-Instruct (1.3B/6.7B/33B)
New datasets
- Pre-training datasets
- RedPajama V2
- Pile
- Supervised fine-tuning datasets
- OpenPlatypus
- ShareGPT Hyperfiltered
- ShareGPT4
- UltraChat 200k
- AgentInstruct
- LMSYS Chat 1M
- Evol Instruct V2
Bug fix
- Fix full-parameter DPO training #1383 #1422 (inspired by @mengban)
- Fix tokenizer config by @lvzii in #1436
- Fix #1197 #1215 #1217 #1218 #1228 #1232 #1285 #1287 #1290 #1316 #1325 #1349 #1356 #1365 #1411 #1418 #1438 #1439 #1446
- Python
Published by hiyouga over 2 years ago
lazyllm-llamafactory - v0.2.0: Refactor Web UI, support LongLoRA
New features
- Support LongLoRA for the LLaMA models
- Support training the Qwen-14B and InternLM-20B models
- Support training state recovery for the all-in-one Web UI
- Support Ascend NPU by @statelesshz in #975
- Integrate MMLU, C-Eval and CMMLU benchmarks
Modifications
- Rename repository to LLaMA Factory (former LLaMA Efficient Tuning)
- Use the `cutoff_len` argument instead of `max_source_length` and `max_target_length` #944
- Add a `train_on_prompt` option #1184
Bug fix
- Fix numeric error caused by the layer norm dtype in https://github.com/hiyouga/LLaMA-Factory/commit/84b7486885c600e5e65c5ba9095d56ecc2502977 [1]
- Fix bugs in PPO Trainer by @mmbwf in #900
- Fix #424 #762 #814 #887 #913 #1000 #1026 #1032 #1064 #1068 #1074 #1086 #1097 #1176 #1177 #1190 #1191
[1] https://github.com/huggingface/transformers/pull/25598#discussion_r1335345914
- Python
Published by hiyouga over 2 years ago
lazyllm-llamafactory - v0.1.8: FlashAttention-2 and Baichuan2
New features
- Support FlashAttention-2 for LLaMA models. (RTX4090, A100, A800 or H100 is required)
- Support training the Baichuan2 models
- Use right-padding to avoid overflow in fp16 training (also mentioned here)
- Align the computation method of the reward score with DeepSpeed-Chat (better generation)
- Support `--lora_target all` argument which automatically finds the applicable modules for LoRA training
Bug fix
- Use efficient EOS tokens to align with the Baichuan training ( https://github.com/baichuan-inc/Baichuan2/issues/23 )
- Remove PeftTrainer to save model checkpoints in DeepSpeed training
- Fix bugs in web UI by @beat4ocean in #596, by @codemayq in #644 #651 #678 #741, and by @kinghuin in #786
- Add dataset explanation by @panpan0000 in #629
- Fix a bug in the DPO data collator
- Fix a bug of the ChatGLM2 tokenizer in right-padding
- #608 #617 #649 #757 #761 #763 #809 #818
- Python
Published by hiyouga over 2 years ago
lazyllm-llamafactory - v0.1.7: Script preview and RoPE scaling
New features
- Preview training script in Web UI by @codemayq in #479 #511
- Support resuming from checkpoints by @niuba in #434 (`transformers>=4.31.0` required)
- Two RoPE scaling methods: linear and NTK-aware scaling for LLaMA models (`transformers>=4.31.0` required)
- Support training the ChatGLM2-6B model
- Support PPO training in bfloat16 data type #551
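To illustrate the difference between the two RoPE scaling methods mentioned above, the sketch below computes rotary frequencies in plain Python. Linear scaling divides positions by the factor. The NTK-aware variant shown here enlarges the rotary base by factor^(d / (d - 2)), which is the commonly cited formulation and is an assumption about what the library applies. The function names are hypothetical.

```python
def rope_inv_freq(dim: int, base: float = 10000.0) -> list[float]:
    """Inverse frequencies base^(-2i/dim) for i in [0, dim/2)."""
    return [base ** (-2.0 * i / dim) for i in range(dim // 2)]


def linear_scaled_angles(position: int, dim: int, factor: float, base: float = 10000.0) -> list[float]:
    """Linear scaling: positions are compressed by `factor` before rotation."""
    return [(position / factor) * freq for freq in rope_inv_freq(dim, base)]


def ntk_scaled_inv_freq(dim: int, factor: float, base: float = 10000.0) -> list[float]:
    """NTK-aware scaling: positions are kept and the rotary base is enlarged."""
    new_base = base * factor ** (dim / (dim - 2))
    return rope_inv_freq(dim, new_base)


plain = rope_inv_freq(8)
ntk = ntk_scaled_inv_freq(8, factor=2.0)
linear = linear_scaled_angles(4, 8, factor=2.0)
print(plain, ntk, linear)
```

With this base adjustment, the highest frequency is left untouched while the lowest is divided by the factor. Linear scaling instead slows every frequency equally.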
Bug fix
- Unusual output of quantized models #278 #391
- Runtime error in distributed DPO training #480
- Unexpected truncation in generation #532
- Dataset streaming error in pre-training #548 #549
- Tensor shape mismatch in PPO training using ChatGLM2 #527 #528
- #475 #476 #478 #481 #494 #551
- Python
Published by hiyouga almost 3 years ago
lazyllm-llamafactory - v0.1.6: DPO Training and Qwen-7B
- Adapt DPO training from the TRL library
- Support fine-tuning the Qwen-7B, Qwen-7B-Chat, XVERSE-13B, and ChatGLM2-6B models
- Implement the "safe" ChatML template for Qwen-7B-Chat
- Better Web UI
- Pretty readme by @codemayq #382
- New features: #395 #451
- Fix InternLM-7B inference #312
- Fix bugs: #351 #354 #361 #376 #408 #417 #420 #423 #426
- Python
Published by hiyouga almost 3 years ago
lazyllm-llamafactory - v0.1.5: Patch release
- Fix LLaMA-2 template #307
- Fix bug in preprocessing 968ce0dcce6bfef582ce37aea6566a65f5aac811
- Fix #294 #296
- Python
Published by hiyouga almost 3 years ago
lazyllm-llamafactory - v0.1.4: Dataset streaming
- Support dataset streaming
- Fix LLaMA-2 #268
- Fix DeepSpeed ZeRO-3 model save #274
- Fix #242 #284
- Python
Published by hiyouga almost 3 years ago
lazyllm-llamafactory - v0.1.2: LLaMA-2 models
- Support LLaMA-2 (good issue #202 )
- Advanced configurations in Web UI
- Fix API (downgrade pydantic<2.0.0)
- Fix baichuan lora hparam #194 #212
- Fix padding #196
- Fix ZeRO-3 #199
- Allow passing args to the app #213
- Code simplification
- Add ShareGPT dataset
- Python
Published by hiyouga almost 3 years ago
lazyllm-llamafactory - v0.1.1
- Web UI: `source_prefix`, `max_length`, dev set
- Bug fix: reward token #179
- Update template #171 #177
- Bug fix: replace the Literal type with Enum for pydantic [1] #176
- Add Web demo #180
[1] https://github.com/pydantic/pydantic/issues/5821, https://github.com/tiangolo/sqlmodel/issues/67
- Python
Published by hiyouga almost 3 years ago
lazyllm-llamafactory - v0.1.0: All-in-one Web UI
- Fix gradient accumulation in PPO Trainer https://github.com/hiyouga/ChatGLM-Efficient-Tuning/issues/299
- All-in-one Web UI by @hiyouga, @KanadeSiina and @codemayq
- Python
Published by hiyouga almost 3 years ago