https://github.com/artificialzeng/baichuan-chat-tuning

实现了Baichuan-Chat微调，Lora、QLora等各种微调方式，一键运行。

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

○
CITATION.cff file
○
codemeta.json file
○
.zenodo.json file
○
DOI references
✓
Academic publication links
Links to: arxiv.org
○
Academic email domains
○
Institutional organization owner
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (7.3%) to scientific vocabulary

Last synced: 10 months ago · JSON representation

Repository

实现了Baichuan-Chat微调，Lora、QLora等各种微调方式，一键运行。

Basic Info

Host: GitHub
Owner: ArtificialZeng
License: apache-2.0
Language: Python
Default Branch: main
Size: 95.4 MB

Statistics

Stars: 66
Watchers: 2
Forks: 3
Open Issues: 2
Releases: 0

Created almost 3 years ago · Last pushed almost 3 years ago

Metadata Files

Readme License

README.md

Python 3.8+ PyTorch 1.13.1+
Transformers, Datasets, Accelerate, PEFT TRL
sentencepiece tiktoken
jieba, rouge-chinese nltk ()
gradio matplotlib ()
uvicorn, fastapi sse-starlette ( API)

** GPU**

data/example_dataset .json

data/dataset_info.json data/README.md

bash git clone https://github.com/ArtificialZeng/Baichuan-Chat-Tuning conda create -n baichuan_etuning python=3.10 conda activate baichuan_etuning cd Baichuan-Chat-Tuning pip install -r requirements.txt

Windows LoRAQLoRA bitsandbytes , CUDA 11.1 12.1.

bash pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.39.1-py3-none-win_amd64.whl

/

bash CUDA_VISIBLE_DEVICES=0 python src/train_web.py

UI ****

Baichuan(SFT - )

bash CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \ --stage sft \ --model_name_or_path path_to_your_model \ --do_train \ --dataset alpaca_gpt4_zh \ --template baichuan \ --finetuning_type lora \ --output_dir path_to_sft_checkpoint \ --overwrite_cache \ --per_device_train_batch_size 4 \ --gradient_accumulation_steps 4 \ --lr_scheduler_type cosine \ --logging_steps 10 \ --save_steps 1000 \ --learning_rate 5e-5 \ --num_train_epochs 3.0 \ --plot_loss \ --lora_target W_pack \ --fp16

bash CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \ --stage pt \ --model_name_or_path path_to_your_model \ --do_train \ --dataset wiki_demo \ --template default \ --finetuning_type lora \ --output_dir path_to_pt_checkpoint \ --overwrite_cache \ --per_device_train_batch_size 4 \ --gradient_accumulation_steps 4 \ --lr_scheduler_type cosine \ --logging_steps 10 \ --save_steps 1000 \ --learning_rate 5e-5 \ --num_train_epochs 3.0 \ --plot_loss \ --fp16

Baichuan Efficient Tuning

[ English | ]

[23/08/12] *RoPE * LLaMA --rope_scaling linear --rope_scaling dynamic

[23/08/11] DPO

[23/08/03] Qwen-7B --model_name_or_path Qwen/Qwen-7B-Chat --lora_target c_attn Qwen-7B-Chat --template chatml

[23/07/31] **** --streaming --max_steps 10000

[23/07/29] Hugging Face 13B Hugging Face LLaMA-2 / Baichuan

[23/07/19] LLaMA-2 --model_name_or_path meta-baichuan/Llama-2-7b-hf LLaMA-2-chat --template baichuan2

[23/07/18] **** train_web.py @KanadeSiina @codemayq

[23/07/11] Baichuan-13B --model_name_or_path baichuan-inc/Baichuan-13B-Base --lora_target W_pack Baichuan-13B-Chat --template baichuan

[23/07/09] FastEdit FastEdit

[23/07/07] InternLM-7B --model_name_or_path internlm/internlm-7b InternLM-chat --template intern

[23/07/05] Falcon-7B/40B --model_name_or_path tiiuae/falcon-7b --lora_target query_key_value

[23/06/29] **** Hugging Face

[23/06/22] API OpenAI API ** ChatGPT **

[23/06/15] Baichuan-7B --model_name_or_path baichuan-inc/Baichuan-7B --lora_target W_pack

[23/06/03] 4 LoRA QLoRA --quantization_bit 4 4

[23/05/31] BLOOM & BLOOMZ --model_name_or_path bigscience/bloomz-7b1-mt --lora_target query_key_value

| | | | Template | | -------------------------------------------------------- | --------------------------- | ----------------- |----------| | LLaMA | 7B/13B/33B/65B | qproj,vproj | - | | LLaMA-2 | 7B/13B/70B | qproj,vproj | baichuan2 | | BLOOM | 560M/1.1B/1.7B/3B/7.1B/176B | querykeyvalue | - | | BLOOMZ | 560M/1.1B/1.7B/3B/7.1B/176B | querykeyvalue | - | | Falcon | 7B/40B | querykeyvalue | - | | Baichuan | 7B/13B | Wpack | baichuan | | InternLM | 7B | qproj,vproj | intern | | Qwen | 7B | cattn | chatml | | XVERSE | 13B | qproj,vproj | - | | ChatGLM2 | 6B | querykeyvalue | chatglm2 |

**** --lora_target python src/train_bash.py -h
Base--template default, alpaca, vicuna Chat

| | | | LoRA | QLoRA | | ---------- | ---------- | ----------- | ---- | ----- | | | | | | | | | | | | | | | | | | | | PPO | | | | | | DPO | | | | |

--quantization_bit 4/8 QLoRA

data/README.md

Hugging Face

bash pip install --upgrade huggingface_hub huggingface-cli login

bash CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \ --stage rm \ --model_name_or_path path_to_your_model \ --do_train \ --dataset comparison_gpt4_zh \ --template default \ --finetuning_type lora \ --resume_lora_training False \ --checkpoint_dir path_to_sft_checkpoint \ --output_dir path_to_rm_checkpoint \ --per_device_train_batch_size 2 \ --gradient_accumulation_steps 4 \ --lr_scheduler_type cosine \ --logging_steps 10 \ --save_steps 1000 \ --learning_rate 1e-5 \ --num_train_epochs 1.0 \ --plot_loss \ --fp16

PPO

bash CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \ --stage ppo \ --model_name_or_path path_to_your_model \ --do_train \ --dataset alpaca_gpt4_zh \ --template default \ --finetuning_type lora \ --resume_lora_training False \ --checkpoint_dir path_to_sft_checkpoint \ --reward_model path_to_rm_checkpoint \ --output_dir path_to_ppo_checkpoint \ --per_device_train_batch_size 2 \ --gradient_accumulation_steps 4 \ --lr_scheduler_type cosine \ --logging_steps 10 \ --save_steps 1000 \ --learning_rate 1e-5 \ --num_train_epochs 1.0 \ --plot_loss

DPO

bash CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \ --stage dpo \ --model_name_or_path path_to_your_model \ --do_train \ --dataset comparison_gpt4_zh \ --template default \ --finetuning_type lora \ --resume_lora_training False \ --checkpoint_dir path_to_sft_checkpoint \ --output_dir path_to_dpo_checkpoint \ --per_device_train_batch_size 2 \ --gradient_accumulation_steps 4 \ --lr_scheduler_type cosine \ --logging_steps 10 \ --save_steps 1000 \ --learning_rate 1e-5 \ --num_train_epochs 1.0 \ --plot_loss \ --fp16

GPU

Huggingface Accelerate

bash accelerate config # accelerate launch src/train_bash.py #

DeepSpeed ZeRO-2 Accelerate

```yaml compute_environment: LOCAL_MACHINE deepspeed_config: gradient_accumulation_steps: 4 gradient_clipping: 0.5 offload_optimizer_device: none offload_param_device: none zero3_init_flag: false zero_stage: 2 distributed_type: DEEPSPEED downcast_bf16: 'no' machine_rank: 0 main_training_function: main mixed_precision: fp16 num_machines: 1 num_processes: 4 rdzv_backend: static same_network: true tpu_env: [] tpu_use_cluster: false tpu_use_sudo: false use_cpu: false ```

DeepSpeed

bash deepspeed --num_gpus 8 --master_port=9901 src/train_bash.py \ --deepspeed ds_config.json \ ... #

DeepSpeed ZeRO-2 DeepSpeed

```json { "train_micro_batch_size_per_gpu": "auto", "gradient_accumulation_steps": "auto", "gradient_clipping": "auto", "zero_allow_untested_optimizer": true, "fp16": { "enabled": "auto", "loss_scale": 0, "initial_scale_power": 16, "loss_scale_window": 1000, "hysteresis": 2, "min_loss_scale": 1 }, "zero_optimization": { "stage": 2, "allgather_partitions": true, "allgather_bucket_size": 5e8, "reduce_scatter": true, "reduce_bucket_size": 5e8, "overlap_comm": false, "contiguous_gradients": true } } ```

BLEU ROUGE

bash CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \ --stage sft \ --model_name_or_path path_to_your_model \ --do_eval \ --dataset alpaca_gpt4_zh \ --template default \ --finetuning_type lora \ --checkpoint_dir path_to_checkpoint \ --output_dir path_to_eval_result \ --per_device_eval_batch_size 8 \ --max_samples 100 \ --predict_with_generate

--per_device_eval_batch_size=1 --max_target_length 128

bash CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \ --stage sft \ --model_name_or_path path_to_your_model \ --do_predict \ --dataset alpaca_gpt4_zh \ --template default \ --finetuning_type lora \ --checkpoint_dir path_to_checkpoint \ --output_dir path_to_predict_result \ --per_device_eval_batch_size 8 \ --max_samples 100 \ --predict_with_generate

API

bash python src/api_demo.py \ --model_name_or_path path_to_your_model \ --template default \ --finetuning_type lora \ --checkpoint_dir path_to_checkpoint

API http://localhost:8000/docs

bash python src/cli_demo.py \ --model_name_or_path path_to_your_model \ --template default \ --finetuning_type lora \ --checkpoint_dir path_to_checkpoint

bash python src/web_demo.py \ --model_name_or_path path_to_your_model \ --template default \ --finetuning_type lora \ --checkpoint_dir path_to_checkpoint

bash python src/export_model.py \ --model_name_or_path path_to_your_model \ --template default \ --finetuning_type lora \ --checkpoint_dir path_to_checkpoint \ --output_dir path_to_export

TODO

[ ] flash attention (torch / xformers / flashattn)
[ ] Multi-query attention
[ ] RLHF

Apache-2.0

bibtex @Misc{baichuan-efficient-tuning, title = {LLaMA Efficient Tuning}, author = {hiyouga}, howpublished = {\url{https://github.com/hiyouga/Baichuan-Chat-Tuning}}, year = {2023} }

ChatGLM-Efficient-Tuning

Star History

Owner

Name: Dr. Artificial曾小健
Login: ArtificialZeng
Kind: user
Location: Beijing

Website: https://blog.csdn.net/sinat_37574187?type=blog
Repositories: 171
Profile: https://github.com/ArtificialZeng

LLM practitioner/engineer, AI/ML/DL Quant

GitHub Events

Total

Watch event: 2
Fork event: 1

Last Year

Watch event: 2
Fork event: 1

Dependencies

pyproject.toml pypi

requirements.txt pypi

accelerate >=0.21.0
datasets >=2.12.0
fastapi ==0.95.1
gradio >=3.36.0
jieba *
matplotlib *
nltk *
peft >=0.4.0
pydantic ==1.10.11
rouge-chinese *
scipy *
sentencepiece *
sse-starlette *
tiktoken *
torch >=1.13.1
transformers >=4.29.1
trl >=0.5.0
uvicorn *

setup.py pypi

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Open Source Science

https://github.com/artificialzeng/baichuan-chat-tuning

Science Score: 10.0%

Repository

Basic Info

Statistics

Metadata Files

README.md

/

Baichuan(SFT - )

Baichuan Efficient Tuning

PPO

DPO

GPU

Huggingface Accelerate

DeepSpeed

BLEU ROUGE

API

TODO

Star History

Owner

GitHub Events

Total

Last Year

Dependencies