https://github.com/artificialzeng/baichuan-chat-tuning

实现了Baichuan-Chat微调,Lora、QLora等各种微调方式,一键运行。

https://github.com/artificialzeng/baichuan-chat-tuning

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.3%) to scientific vocabulary
Last synced: 10 months ago · JSON representation

Repository

实现了Baichuan-Chat微调,Lora、QLora等各种微调方式,一键运行。

Basic Info
  • Host: GitHub
  • Owner: ArtificialZeng
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 95.4 MB
Statistics
  • Stars: 66
  • Watchers: 2
  • Forks: 3
  • Open Issues: 2
  • Releases: 0
Created almost 3 years ago · Last pushed almost 3 years ago
Metadata Files
Readme License

README.md

  • Python 3.8+ PyTorch 1.13.1+
  • Transformers, Datasets, Accelerate, PEFT TRL
  • sentencepiece tiktoken
  • jieba, rouge-chinese nltk ()
  • gradio matplotlib ()
  • uvicorn, fastapi sse-starlette ( API)

** GPU**

data/example_dataset .json

data/dataset_info.json data/README.md

bash git clone https://github.com/ArtificialZeng/Baichuan-Chat-Tuning conda create -n baichuan_etuning python=3.10 conda activate baichuan_etuning cd Baichuan-Chat-Tuning pip install -r requirements.txt

Windows LoRAQLoRA bitsandbytes , CUDA 11.1 12.1.

bash pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.39.1-py3-none-win_amd64.whl

/

bash CUDA_VISIBLE_DEVICES=0 python src/train_web.py

UI ****

Baichuan(SFT - )

bash CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \ --stage sft \ --model_name_or_path path_to_your_model \ --do_train \ --dataset alpaca_gpt4_zh \ --template baichuan \ --finetuning_type lora \ --output_dir path_to_sft_checkpoint \ --overwrite_cache \ --per_device_train_batch_size 4 \ --gradient_accumulation_steps 4 \ --lr_scheduler_type cosine \ --logging_steps 10 \ --save_steps 1000 \ --learning_rate 5e-5 \ --num_train_epochs 3.0 \ --plot_loss \ --lora_target W_pack \ --fp16

bash CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \ --stage pt \ --model_name_or_path path_to_your_model \ --do_train \ --dataset wiki_demo \ --template default \ --finetuning_type lora \ --output_dir path_to_pt_checkpoint \ --overwrite_cache \ --per_device_train_batch_size 4 \ --gradient_accumulation_steps 4 \ --lr_scheduler_type cosine \ --logging_steps 10 \ --save_steps 1000 \ --learning_rate 5e-5 \ --num_train_epochs 3.0 \ --plot_loss \ --fp16

Baichuan Efficient Tuning

GitHub Repo stars GitHub Code License GitHub last commit PyPI GitHub pull request

[ English | ]

[23/08/12] *RoPE * LLaMA --rope_scaling linear --rope_scaling dynamic

[23/08/11] DPO

[23/08/03] Qwen-7B --model_name_or_path Qwen/Qwen-7B-Chat --lora_target c_attn Qwen-7B-Chat --template chatml

[23/07/31] **** --streaming --max_steps 10000

[23/07/29] Hugging Face 13B Hugging Face LLaMA-2 / Baichuan

[23/07/19] LLaMA-2 --model_name_or_path meta-baichuan/Llama-2-7b-hf LLaMA-2-chat --template baichuan2

[23/07/18] **** train_web.py @KanadeSiina @codemayq

[23/07/11] Baichuan-13B --model_name_or_path baichuan-inc/Baichuan-13B-Base --lora_target W_pack Baichuan-13B-Chat --template baichuan

[23/07/09] FastEdit FastEdit

[23/07/07] InternLM-7B --model_name_or_path internlm/internlm-7b InternLM-chat --template intern

[23/07/05] Falcon-7B/40B --model_name_or_path tiiuae/falcon-7b --lora_target query_key_value

[23/06/29] **** Hugging Face

[23/06/22] API OpenAI API ** ChatGPT **

[23/06/15] Baichuan-7B --model_name_or_path baichuan-inc/Baichuan-7B --lora_target W_pack

[23/06/03] 4 LoRA QLoRA --quantization_bit 4 4

[23/05/31] BLOOM & BLOOMZ --model_name_or_path bigscience/bloomz-7b1-mt --lora_target query_key_value

| | | | Template | | -------------------------------------------------------- | --------------------------- | ----------------- |----------| | LLaMA | 7B/13B/33B/65B | qproj,vproj | - | | LLaMA-2 | 7B/13B/70B | qproj,vproj | baichuan2 | | BLOOM | 560M/1.1B/1.7B/3B/7.1B/176B | querykeyvalue | - | | BLOOMZ | 560M/1.1B/1.7B/3B/7.1B/176B | querykeyvalue | - | | Falcon | 7B/40B | querykeyvalue | - | | Baichuan | 7B/13B | Wpack | baichuan | | InternLM | 7B | qproj,vproj | intern | | Qwen | 7B | cattn | chatml | | XVERSE | 13B | qproj,vproj | - | | ChatGLM2 | 6B | querykeyvalue | chatglm2 |

  • **** --lora_target python src/train_bash.py -h
  • Base--template default, alpaca, vicuna Chat

| | | | LoRA | QLoRA | | ---------- | ---------- | ----------- | ---- | ----- | | | | | | | | | | | | | | | | | | | | PPO | | | | | | DPO | | | | |

  • --quantization_bit 4/8 QLoRA

data/README.md

Hugging Face

bash pip install --upgrade huggingface_hub huggingface-cli login

bash CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \ --stage rm \ --model_name_or_path path_to_your_model \ --do_train \ --dataset comparison_gpt4_zh \ --template default \ --finetuning_type lora \ --resume_lora_training False \ --checkpoint_dir path_to_sft_checkpoint \ --output_dir path_to_rm_checkpoint \ --per_device_train_batch_size 2 \ --gradient_accumulation_steps 4 \ --lr_scheduler_type cosine \ --logging_steps 10 \ --save_steps 1000 \ --learning_rate 1e-5 \ --num_train_epochs 1.0 \ --plot_loss \ --fp16

PPO

bash CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \ --stage ppo \ --model_name_or_path path_to_your_model \ --do_train \ --dataset alpaca_gpt4_zh \ --template default \ --finetuning_type lora \ --resume_lora_training False \ --checkpoint_dir path_to_sft_checkpoint \ --reward_model path_to_rm_checkpoint \ --output_dir path_to_ppo_checkpoint \ --per_device_train_batch_size 2 \ --gradient_accumulation_steps 4 \ --lr_scheduler_type cosine \ --logging_steps 10 \ --save_steps 1000 \ --learning_rate 1e-5 \ --num_train_epochs 1.0 \ --plot_loss

DPO

bash CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \ --stage dpo \ --model_name_or_path path_to_your_model \ --do_train \ --dataset comparison_gpt4_zh \ --template default \ --finetuning_type lora \ --resume_lora_training False \ --checkpoint_dir path_to_sft_checkpoint \ --output_dir path_to_dpo_checkpoint \ --per_device_train_batch_size 2 \ --gradient_accumulation_steps 4 \ --lr_scheduler_type cosine \ --logging_steps 10 \ --save_steps 1000 \ --learning_rate 1e-5 \ --num_train_epochs 1.0 \ --plot_loss \ --fp16

GPU

Huggingface Accelerate

bash accelerate config # accelerate launch src/train_bash.py #

DeepSpeed ZeRO-2 Accelerate ```yaml compute_environment: LOCAL_MACHINE deepspeed_config: gradient_accumulation_steps: 4 gradient_clipping: 0.5 offload_optimizer_device: none offload_param_device: none zero3_init_flag: false zero_stage: 2 distributed_type: DEEPSPEED downcast_bf16: 'no' machine_rank: 0 main_training_function: main mixed_precision: fp16 num_machines: 1 num_processes: 4 rdzv_backend: static same_network: true tpu_env: [] tpu_use_cluster: false tpu_use_sudo: false use_cpu: false ```

DeepSpeed

bash deepspeed --num_gpus 8 --master_port=9901 src/train_bash.py \ --deepspeed ds_config.json \ ... #

DeepSpeed ZeRO-2 DeepSpeed ```json { "train_micro_batch_size_per_gpu": "auto", "gradient_accumulation_steps": "auto", "gradient_clipping": "auto", "zero_allow_untested_optimizer": true, "fp16": { "enabled": "auto", "loss_scale": 0, "initial_scale_power": 16, "loss_scale_window": 1000, "hysteresis": 2, "min_loss_scale": 1 }, "zero_optimization": { "stage": 2, "allgather_partitions": true, "allgather_bucket_size": 5e8, "reduce_scatter": true, "reduce_bucket_size": 5e8, "overlap_comm": false, "contiguous_gradients": true } } ```

BLEU ROUGE

bash CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \ --stage sft \ --model_name_or_path path_to_your_model \ --do_eval \ --dataset alpaca_gpt4_zh \ --template default \ --finetuning_type lora \ --checkpoint_dir path_to_checkpoint \ --output_dir path_to_eval_result \ --per_device_eval_batch_size 8 \ --max_samples 100 \ --predict_with_generate

--per_device_eval_batch_size=1 --max_target_length 128

bash CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \ --stage sft \ --model_name_or_path path_to_your_model \ --do_predict \ --dataset alpaca_gpt4_zh \ --template default \ --finetuning_type lora \ --checkpoint_dir path_to_checkpoint \ --output_dir path_to_predict_result \ --per_device_eval_batch_size 8 \ --max_samples 100 \ --predict_with_generate

API

bash python src/api_demo.py \ --model_name_or_path path_to_your_model \ --template default \ --finetuning_type lora \ --checkpoint_dir path_to_checkpoint

API http://localhost:8000/docs

bash python src/cli_demo.py \ --model_name_or_path path_to_your_model \ --template default \ --finetuning_type lora \ --checkpoint_dir path_to_checkpoint

bash python src/web_demo.py \ --model_name_or_path path_to_your_model \ --template default \ --finetuning_type lora \ --checkpoint_dir path_to_checkpoint

bash python src/export_model.py \ --model_name_or_path path_to_your_model \ --template default \ --finetuning_type lora \ --checkpoint_dir path_to_checkpoint \ --output_dir path_to_export

TODO

Apache-2.0

bibtex @Misc{baichuan-efficient-tuning, title = {LLaMA Efficient Tuning}, author = {hiyouga}, howpublished = {\url{https://github.com/hiyouga/Baichuan-Chat-Tuning}}, year = {2023} }

ChatGLM-Efficient-Tuning

Star History

Star History Chart

Owner

  • Name: Dr. Artificial曾小健
  • Login: ArtificialZeng
  • Kind: user
  • Location: Beijing

LLM practitioner/engineer, AI/ML/DL Quant

GitHub Events

Total
  • Watch event: 2
  • Fork event: 1
Last Year
  • Watch event: 2
  • Fork event: 1

Dependencies

pyproject.toml pypi
requirements.txt pypi
  • accelerate >=0.21.0
  • datasets >=2.12.0
  • fastapi ==0.95.1
  • gradio >=3.36.0
  • jieba *
  • matplotlib *
  • nltk *
  • peft >=0.4.0
  • pydantic ==1.10.11
  • rouge-chinese *
  • scipy *
  • sentencepiece *
  • sse-starlette *
  • tiktoken *
  • torch >=1.13.1
  • transformers >=4.29.1
  • trl >=0.5.0
  • uvicorn *
setup.py pypi