https://github.com/artificialzeng/baichuan-chat-tuning
实现了Baichuan-Chat微调,Lora、QLora等各种微调方式,一键运行。
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
○codemeta.json file
-
○.zenodo.json file
-
○DOI references
-
✓Academic publication links
Links to: arxiv.org -
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (7.3%) to scientific vocabulary
Repository
实现了Baichuan-Chat微调,Lora、QLora等各种微调方式,一键运行。
Basic Info
- Host: GitHub
- Owner: ArtificialZeng
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 95.4 MB
Statistics
- Stars: 66
- Watchers: 2
- Forks: 3
- Open Issues: 2
- Releases: 0
Metadata Files
README.md
- Python 3.8+ PyTorch 1.13.1+
- Transformers, Datasets, Accelerate, PEFT TRL
- sentencepiece tiktoken
- jieba, rouge-chinese nltk ()
- gradio matplotlib ()
- uvicorn, fastapi sse-starlette ( API)
** GPU**
data/dataset_info.json data/README.md
bash
git clone https://github.com/ArtificialZeng/Baichuan-Chat-Tuning
conda create -n baichuan_etuning python=3.10
conda activate baichuan_etuning
cd Baichuan-Chat-Tuning
pip install -r requirements.txt
Windows LoRAQLoRA bitsandbytes , CUDA 11.1 12.1.
bash
pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.39.1-py3-none-win_amd64.whl
/
bash
CUDA_VISIBLE_DEVICES=0 python src/train_web.py
UI ****
Baichuan(SFT - )
bash
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
--stage sft \
--model_name_or_path path_to_your_model \
--do_train \
--dataset alpaca_gpt4_zh \
--template baichuan \
--finetuning_type lora \
--output_dir path_to_sft_checkpoint \
--overwrite_cache \
--per_device_train_batch_size 4 \
--gradient_accumulation_steps 4 \
--lr_scheduler_type cosine \
--logging_steps 10 \
--save_steps 1000 \
--learning_rate 5e-5 \
--num_train_epochs 3.0 \
--plot_loss \
--lora_target W_pack \
--fp16
bash
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
--stage pt \
--model_name_or_path path_to_your_model \
--do_train \
--dataset wiki_demo \
--template default \
--finetuning_type lora \
--output_dir path_to_pt_checkpoint \
--overwrite_cache \
--per_device_train_batch_size 4 \
--gradient_accumulation_steps 4 \
--lr_scheduler_type cosine \
--logging_steps 10 \
--save_steps 1000 \
--learning_rate 5e-5 \
--num_train_epochs 3.0 \
--plot_loss \
--fp16
Baichuan Efficient Tuning
[ English | ]
[23/08/12] *RoPE * LLaMA --rope_scaling linear --rope_scaling dynamic
[23/08/11] DPO
[23/08/03] Qwen-7B --model_name_or_path Qwen/Qwen-7B-Chat --lora_target c_attn Qwen-7B-Chat --template chatml
[23/07/31] **** --streaming --max_steps 10000
[23/07/29] Hugging Face 13B Hugging Face LLaMA-2 / Baichuan
[23/07/19] LLaMA-2 --model_name_or_path meta-baichuan/Llama-2-7b-hf LLaMA-2-chat --template baichuan2
[23/07/18] **** train_web.py @KanadeSiina @codemayq
[23/07/11] Baichuan-13B --model_name_or_path baichuan-inc/Baichuan-13B-Base --lora_target W_pack Baichuan-13B-Chat --template baichuan
[23/07/07] InternLM-7B --model_name_or_path internlm/internlm-7b InternLM-chat --template intern
[23/07/05] Falcon-7B/40B --model_name_or_path tiiuae/falcon-7b --lora_target query_key_value
[23/06/29] **** Hugging Face
[23/06/22] API OpenAI API ** ChatGPT **
[23/06/15] Baichuan-7B --model_name_or_path baichuan-inc/Baichuan-7B --lora_target W_pack
[23/06/03] 4 LoRA QLoRA --quantization_bit 4 4
[23/05/31] BLOOM & BLOOMZ --model_name_or_path bigscience/bloomz-7b1-mt --lora_target query_key_value
| | | | Template | | -------------------------------------------------------- | --------------------------- | ----------------- |----------| | LLaMA | 7B/13B/33B/65B | qproj,vproj | - | | LLaMA-2 | 7B/13B/70B | qproj,vproj | baichuan2 | | BLOOM | 560M/1.1B/1.7B/3B/7.1B/176B | querykeyvalue | - | | BLOOMZ | 560M/1.1B/1.7B/3B/7.1B/176B | querykeyvalue | - | | Falcon | 7B/40B | querykeyvalue | - | | Baichuan | 7B/13B | Wpack | baichuan | | InternLM | 7B | qproj,vproj | intern | | Qwen | 7B | cattn | chatml | | XVERSE | 13B | qproj,vproj | - | | ChatGLM2 | 6B | querykeyvalue | chatglm2 |
- ****
--lora_targetpython src/train_bash.py -h - Base
--templatedefault,alpaca,vicunaChat
| | | | LoRA | QLoRA | | ---------- | ---------- | ----------- | ---- | ----- | | | | | | | | | | | | | | | | | | | | PPO | | | | | | DPO | | | | |
-
--quantization_bit 4/8QLoRA
-
- Stanford Alpaca (en)
- Stanford Alpaca (zh)
- GPT-4 Generated Data (en&zh)
- Open Assistant (multilingual)
- Self-cognition (zh)
- ShareGPT (zh)
- Guanaco Dataset (multilingual)
- BELLE 2M (zh)
- BELLE 1M (zh)
- BELLE 0.5M (zh)
- BELLE Dialogue 0.4M (zh)
- BELLE School Math 0.25M (zh)
- BELLE Multiturn Chat 0.8M (zh)
- Firefly 1.1M (zh)
- LIMA (en)
- CodeAlpaca 20k (en)
- Alpaca CoT (multilingual)
- Web QA (zh)
- UltraChat (en)
- WebNovel (zh)
- DPO
Hugging Face
bash
pip install --upgrade huggingface_hub
huggingface-cli login
bash
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
--stage rm \
--model_name_or_path path_to_your_model \
--do_train \
--dataset comparison_gpt4_zh \
--template default \
--finetuning_type lora \
--resume_lora_training False \
--checkpoint_dir path_to_sft_checkpoint \
--output_dir path_to_rm_checkpoint \
--per_device_train_batch_size 2 \
--gradient_accumulation_steps 4 \
--lr_scheduler_type cosine \
--logging_steps 10 \
--save_steps 1000 \
--learning_rate 1e-5 \
--num_train_epochs 1.0 \
--plot_loss \
--fp16
PPO
bash
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
--stage ppo \
--model_name_or_path path_to_your_model \
--do_train \
--dataset alpaca_gpt4_zh \
--template default \
--finetuning_type lora \
--resume_lora_training False \
--checkpoint_dir path_to_sft_checkpoint \
--reward_model path_to_rm_checkpoint \
--output_dir path_to_ppo_checkpoint \
--per_device_train_batch_size 2 \
--gradient_accumulation_steps 4 \
--lr_scheduler_type cosine \
--logging_steps 10 \
--save_steps 1000 \
--learning_rate 1e-5 \
--num_train_epochs 1.0 \
--plot_loss
DPO
bash
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
--stage dpo \
--model_name_or_path path_to_your_model \
--do_train \
--dataset comparison_gpt4_zh \
--template default \
--finetuning_type lora \
--resume_lora_training False \
--checkpoint_dir path_to_sft_checkpoint \
--output_dir path_to_dpo_checkpoint \
--per_device_train_batch_size 2 \
--gradient_accumulation_steps 4 \
--lr_scheduler_type cosine \
--logging_steps 10 \
--save_steps 1000 \
--learning_rate 1e-5 \
--num_train_epochs 1.0 \
--plot_loss \
--fp16
GPU
Huggingface Accelerate
bash
accelerate config #
accelerate launch src/train_bash.py #
DeepSpeed ZeRO-2 Accelerate
```yaml compute_environment: LOCAL_MACHINE deepspeed_config: gradient_accumulation_steps: 4 gradient_clipping: 0.5 offload_optimizer_device: none offload_param_device: none zero3_init_flag: false zero_stage: 2 distributed_type: DEEPSPEED downcast_bf16: 'no' machine_rank: 0 main_training_function: main mixed_precision: fp16 num_machines: 1 num_processes: 4 rdzv_backend: static same_network: true tpu_env: [] tpu_use_cluster: false tpu_use_sudo: false use_cpu: false ```DeepSpeed
bash
deepspeed --num_gpus 8 --master_port=9901 src/train_bash.py \
--deepspeed ds_config.json \
... #
DeepSpeed ZeRO-2 DeepSpeed
```json { "train_micro_batch_size_per_gpu": "auto", "gradient_accumulation_steps": "auto", "gradient_clipping": "auto", "zero_allow_untested_optimizer": true, "fp16": { "enabled": "auto", "loss_scale": 0, "initial_scale_power": 16, "loss_scale_window": 1000, "hysteresis": 2, "min_loss_scale": 1 }, "zero_optimization": { "stage": 2, "allgather_partitions": true, "allgather_bucket_size": 5e8, "reduce_scatter": true, "reduce_bucket_size": 5e8, "overlap_comm": false, "contiguous_gradients": true } } ```BLEU ROUGE
bash
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
--stage sft \
--model_name_or_path path_to_your_model \
--do_eval \
--dataset alpaca_gpt4_zh \
--template default \
--finetuning_type lora \
--checkpoint_dir path_to_checkpoint \
--output_dir path_to_eval_result \
--per_device_eval_batch_size 8 \
--max_samples 100 \
--predict_with_generate
--per_device_eval_batch_size=1 --max_target_length 128
bash
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
--stage sft \
--model_name_or_path path_to_your_model \
--do_predict \
--dataset alpaca_gpt4_zh \
--template default \
--finetuning_type lora \
--checkpoint_dir path_to_checkpoint \
--output_dir path_to_predict_result \
--per_device_eval_batch_size 8 \
--max_samples 100 \
--predict_with_generate
API
bash
python src/api_demo.py \
--model_name_or_path path_to_your_model \
--template default \
--finetuning_type lora \
--checkpoint_dir path_to_checkpoint
API http://localhost:8000/docs
bash
python src/cli_demo.py \
--model_name_or_path path_to_your_model \
--template default \
--finetuning_type lora \
--checkpoint_dir path_to_checkpoint
bash
python src/web_demo.py \
--model_name_or_path path_to_your_model \
--template default \
--finetuning_type lora \
--checkpoint_dir path_to_checkpoint
bash
python src/export_model.py \
--model_name_or_path path_to_your_model \
--template default \
--finetuning_type lora \
--checkpoint_dir path_to_checkpoint \
--output_dir path_to_export
TODO
bibtex
@Misc{baichuan-efficient-tuning,
title = {LLaMA Efficient Tuning},
author = {hiyouga},
howpublished = {\url{https://github.com/hiyouga/Baichuan-Chat-Tuning}},
year = {2023}
}
Star History
Owner
- Name: Dr. Artificial曾小健
- Login: ArtificialZeng
- Kind: user
- Location: Beijing
- Website: https://blog.csdn.net/sinat_37574187?type=blog
- Repositories: 171
- Profile: https://github.com/ArtificialZeng
LLM practitioner/engineer, AI/ML/DL Quant
GitHub Events
Total
- Watch event: 2
- Fork event: 1
Last Year
- Watch event: 2
- Fork event: 1
Dependencies
- accelerate >=0.21.0
- datasets >=2.12.0
- fastapi ==0.95.1
- gradio >=3.36.0
- jieba *
- matplotlib *
- nltk *
- peft >=0.4.0
- pydantic ==1.10.11
- rouge-chinese *
- scipy *
- sentencepiece *
- sse-starlette *
- tiktoken *
- torch >=1.13.1
- transformers >=4.29.1
- trl >=0.5.0
- uvicorn *