https://github.com/artificialzeng/chatglm-finetuning

Fine-tuning of the ChatGLM-6B model on specific downstream tasks, covering Freeze, Lora, P-tuning, and more.

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity, 2.1%, to scientific vocabulary)

Last synced: 10 months ago

Repository

Fine-tuning of the ChatGLM-6B model on specific downstream tasks, covering Freeze, Lora, P-tuning, and more.

Basic Info

Host: GitHub
Owner: ArtificialZeng
Default Branch: master
Homepage:
Size: 1.2 MB

Statistics

Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Releases: 0

Fork of liucongg/ChatGLM-Finetuning

Created almost 3 years ago · Last pushed about 3 years ago

https://github.com/ArtificialZeng/ChatGLM-Finetuning/blob/master/

## ChatGLM Fine-Tuning

[Baidu Netdisk link](https://pan.baidu.com/s/1-UrZWnqw6Ciyo5K2NLraDg) (extraction code: jh0l)

- update-2023.06.12 [Pipeline](https://zhuanlan.zhihu.com/p/636488690)
- update-2023.04.18
- update-2023.04.05

## Fine-Tuning Methods
### Freeze
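The Freeze method freezes part of the original model's parameters and trains only the rest, so that a large model can be trained on a single GPU, or without tensor parallelism (TP) or pipeline parallelism (PP).

The fine-tuning code is in finetuning_freeze.py; the core part is as follows: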
```python3
for name, param in model.named_parameters():
    if not any(nd in name for nd in ["layers.27", "layers.26", "layers.25", "layers.24", "layers.23"]):
        param.requires_grad = False
```
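The name filter above can be tried out without loading the model. The sketch below applies the same substring test to a handful of made-up parameter names (the names are illustrative assumptions, not read from the actual checkpoint) and reports which ones would stay trainable.

```python
# Layer-name fragments kept trainable, copied from the snippet above.
TRAINABLE_KEYS = ["layers.27", "layers.26", "layers.25", "layers.24", "layers.23"]


def is_trainable(name, keys=TRAINABLE_KEYS):
    """Return True if the parameter name contains any trainable fragment."""
    return any(key in name for key in keys)


def split_parameters(names, keys=TRAINABLE_KEYS):
    """Split parameter names into (trainable, frozen) lists, keeping order."""
    trainable = [n for n in names if is_trainable(n, keys)]
    frozen = [n for n in names if not is_trainable(n, keys)]
    return trainable, frozen


if __name__ == "__main__":
    # Hypothetical parameter names, for illustration only.
    names = [
        "transformer.word_embeddings.weight",
        "transformer.layers.2.attention.query_key_value.weight",
        "transformer.layers.22.mlp.dense_h_to_4h.weight",
        "transformer.layers.23.attention.dense.weight",
        "transformer.layers.27.mlp.dense_4h_to_h.bias",
        "lm_head.weight",
    ]
    trainable, frozen = split_parameters(names)
    print("trainable:", trainable)
    print("frozen:", frozen)
```

Because the test is a plain substring match, a shorter fragment such as `"layers.2"` would also match `layers.20` through `layers.27`; the two-digit fragments used here avoid that.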

Training uses DeepSpeed. The configurable parameters include `train_path`, `model_dir`, `num_train_epochs`, `train_batch_size`, `gradient_accumulation_steps`, `output_dir`, and `prompt_text`.

```
CUDA_VISIBLE_DEVICES=0 deepspeed finetuning_freeze.py --num_train_epochs 5 --train_batch_size 2
```
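The command above sets `--train_batch_size 2`; together with `gradient_accumulation_steps`, this determines how many samples contribute to each optimizer step. A minimal sketch of that arithmetic, assuming `train_batch_size` is the per-GPU batch size (an assumption about this script) and using hypothetical dataset sizes:

```python
import math


def effective_batch_size(per_gpu_batch, grad_accum_steps, num_gpus=1):
    """Samples consumed per optimizer step."""
    return per_gpu_batch * grad_accum_steps * num_gpus


def optimizer_steps(num_samples, num_epochs, per_gpu_batch, grad_accum_steps, num_gpus=1):
    """Approximate number of optimizer steps, rounding each epoch up."""
    per_step = effective_batch_size(per_gpu_batch, grad_accum_steps, num_gpus)
    return math.ceil(num_samples / per_step) * num_epochs


if __name__ == "__main__":
    # 1000 samples is a made-up dataset size for illustration.
    print(effective_batch_size(2, 4))
    print(optimizer_steps(1000, 5, 2, 4))
```

The real step count can differ slightly depending on how the data loader handles the last partial batch.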
For inference, see predict_freeze.py.

### PT
The PT method is P-Tuning, following the [official ChatGLM code](https://github.com/THUDM/ChatGLM-6B/blob/main/ptuning/README.md); it is a soft-prompt method for large models.

![](images/PT.png)
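- P-Tuning adds new parameters only to the model's Embedding. [paper](https://arxiv.org/abs/2103.10385)
- P-Tuning-V2 adds new parameters to the model's Embedding and in front of every layer. [paper](https://arxiv.org/abs/2110.07602)

The fine-tuning code is in finetuning_pt.py; the core part is as follows: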
```python3
config = ChatGLMConfig.from_pretrained(args.model_dir)
config.pre_seq_len = args.pre_seq_len
config.prefix_projection = args.prefix_projection

model = ChatGLMForConditionalGeneration.from_pretrained(args.model_dir, config=config)

for name, param in model.named_parameters():
    if not any(nd in name for nd in ["prefix_encoder"]):
        param.requires_grad = False
```
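When `prefix_projection` is True, the method is P-Tuning-V2, which adds new parameters to the Embedding and in front of every layer; when it is False, the method is P-Tuning, which adds new parameters to the Embedding only.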
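To get a feel for how much `prefix_projection` changes the number of trainable parameters, the sketch below counts the prefix encoder's parameters in both settings. The layer shapes are an assumption, not taken from this README: one embedding table of width `2 * num_layers * hidden_size` when the flag is False, and an embedding table followed by a two-layer MLP when it is True. The sizes used (hidden size 4096, 28 layers) are likewise assumed values for ChatGLM-6B.

```python
def prefix_encoder_params(pre_seq_len, hidden_size, num_layers, prefix_projection):
    """Count prefix-encoder parameters under the assumed layer shapes."""
    kv_size = num_layers * hidden_size * 2  # width of the prefix output per position
    if not prefix_projection:
        # Assumed shape: a single embedding table of size (pre_seq_len, kv_size).
        return pre_seq_len * kv_size
    # Assumed shape: embedding (pre_seq_len, hidden) followed by
    # Linear(hidden, hidden) and Linear(hidden, kv_size), both with bias.
    embedding = pre_seq_len * hidden_size
    first_linear = hidden_size * hidden_size + hidden_size
    second_linear = hidden_size * kv_size + kv_size
    return embedding + first_linear + second_linear


if __name__ == "__main__":
    for flag in (False, True):
        count = prefix_encoder_params(16, 4096, 28, flag)
        print(f"prefix_projection={flag}: {count:,} parameters")
```

Under these assumptions, `pre_seq_len=16` gives 3,670,016 parameters without projection and 956,600,320 with it, a difference of more than two orders of magnitude.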

Training uses DeepSpeed. The configurable parameters include `train_path`, `model_dir`, `num_train_epochs`, `train_batch_size`, `gradient_accumulation_steps`, `output_dir`, `prompt_text`, and `pre_seq_len`.

```
CUDA_VISIBLE_DEVICES=0 deepspeed finetuning_pt.py --num_train_epochs 5 --train_batch_size 2 --pre_seq_len 16
```
For inference, see predict_pt.py.

### Lora
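The Lora method adds trainable low-rank matrices in parallel to selected weight matrices of the model. During downstream tuning only these added matrices are trained, while the original weights stay frozen.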

![](images/Lora.png)
- Paper: [paper](https://arxiv.org/abs/2106.09685)
- Official code: [Github](https://github.com/microsoft/LoRA)
- HuggingFace peft library: [Github](https://github.com/huggingface/peft)

The fine-tuning code is in finetuning_lora.py; the core part is as follows:
```python3
model = ChatGLMForConditionalGeneration.from_pretrained(args.model_dir)
config = LoraConfig(r=args.lora_r,
                    lora_alpha=32,
                    target_modules=["query_key_value"],
                    lora_dropout=0.1,
                    bias="none",
                    task_type="CAUSAL_LM",
                    inference_mode=False,
                    )

model = get_peft_model(model, config)
```
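As a rough illustration of what the configuration above implies, the sketch below implements the low-rank update in plain Python and counts the parameters it adds. It assumes the usual LoRA formulation, output = W·x + (lora_alpha / r)·B·(A·x), and assumed dimensions for `query_key_value` (input 4096, output 12288, 28 layers); neither the dimensions nor the helper names come from this repository.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, lora_alpha, r):
    """Frozen base projection plus the scaled low-rank update."""
    base = matvec(W, x)               # frozen weight, shape (out, in)
    update = matvec(B, matvec(A, x))  # A: (r, in), B: (out, r)
    scale = lora_alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def lora_params(in_features, out_features, r, num_layers=1):
    """Parameters added by LoRA: r * (in + out) per adapted matrix."""
    return num_layers * r * (in_features + out_features)


if __name__ == "__main__":
    W = [[1, 0], [0, 1]]
    A = [[1, 1]]      # rank 1
    B = [[0], [0]]    # zero-initialised, so the output starts unchanged
    print(lora_forward(W, A, B, [2, 3], lora_alpha=32, r=1))
    print(lora_params(4096, 12288, r=8, num_layers=28))
```

With B initialised to zero the adapted layer reproduces the frozen layer exactly, which is why training can start from the pretrained behaviour.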
Training uses DeepSpeed. The configurable parameters include `train_path`, `model_dir`, `num_train_epochs`, `train_batch_size`, `gradient_accumulation_steps`, `output_dir`, `prompt_text`, and `lora_r`.

```
CUDA_VISIBLE_DEVICES=0 deepspeed finetuning_lora.py --num_train_epochs 5 --train_batch_size 2 --lora_r 8
```
For inference, see predict_lora.py.

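Note: in the saved `adapter_config.json`, set the `inference_mode` parameter to `false`, and call `model.eval()` on the model before prediction.
This is related to the chatglm model code and `Conv1D`.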
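The `inference_mode` change can be made by hand or with a few lines of standard-library Python. The sketch below is one way to do it; the helper name `set_inference_mode` and the directory layout are assumptions for illustration, and the demo writes to a temporary directory rather than a real adapter.

```python
import json
import tempfile
from pathlib import Path


def set_inference_mode(adapter_dir, value=False):
    """Rewrite inference_mode in adapter_config.json and return the new config."""
    config_path = Path(adapter_dir) / "adapter_config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    config["inference_mode"] = value
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # A stand-in config with made-up contents, for illustration only.
        sample = {"inference_mode": True, "r": 8}
        (Path(tmp) / "adapter_config.json").write_text(json.dumps(sample), encoding="utf-8")
        print(set_inference_mode(tmp))
```

All other keys in the file are left as they were.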

### Environment
See requirements.txt.

## Experimental Results
### Triple Extraction
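- The data comes from a [competition dataset](https://www.datafountain.cn/competitions/584); 50 samples are used as the test set.
- Training settings: maximum length 768, Batch size 2, 5 epochs, fp16, DeepSpeed Zero-1.
- PT is the P-Tuning V2 method; PT-Only-Embedding applies the soft-prompt to the Embedding only; Freeze and Lora are the methods described above, with Lora using rank 8.
- Because PT training ran out of memory (OOM) on a 48G A40 GPU, `gradient_checkpointing_enable()` was turned on for the PT experiments.
- Training example: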
```
prompt_text\"\", \"\", \"\" \"\"\"_\"\\n

__\n__
```
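The output format in the example above joins the fields of each item with `_` and separates items with `\n`. A hedged sketch of how predictions in that format could be scored with F1 follows; the parser, the micro-averaging, and the sample strings are assumptions for illustration and may differ from the repository's own evaluation.

```python
def parse_triples(text):
    """Parse 'head_relation_tail' items separated by newlines into a set."""
    # Accept both a real newline and the two-character sequence backslash + n.
    text = text.replace("\\n", "\n")
    triples = set()
    for line in text.strip().split("\n"):
        parts = line.strip().split("_")
        # Keep only well-formed items; fields containing '_' are dropped here.
        if len(parts) == 3 and all(parts):
            triples.add(tuple(parts))
    return triples


def micro_f1(predictions, references):
    """Micro-averaged F1 over exact-match triples."""
    tp = n_pred = n_gold = 0
    for pred_text, gold_text in zip(predictions, references):
        pred, gold = parse_triples(pred_text), parse_triples(gold_text)
        tp += len(pred & gold)
        n_pred += len(pred)
        n_gold += len(gold)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up strings in the same shape as the example output.
    preds = ["A_r_B\nC_r_D"]
    golds = ["A_r_B\nE_r_F"]
    print(micro_f1(preds, golds))
```

Exact matching is strict: a triple counts only if all three fields agree with the reference.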


| Method | PT-Only-Embedding | PT | Freeze | Lora |
| ------- | ------ | ------ | ------ | ------ |
| GPU memory | 37G | 30G | 24G | 39G |
| Total parameters | 6.259B | 7.211B | 6.255B | 6.259B |
| Trainable parameters (%) | 0.0586% | 13.26% | 16.10% | 0.0586% |
| Training time | 53min | 135min | 112min | 65min |
| Test F1 | 0.0 | 0.6283 | 0.5675 | 0.5359 |
| Test time | 191s | 198s | 180s | 278s |


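- Ranking by F1: PT > Freeze > Lora > PT-Only-Embedding.
- PT-Only-Embedding performs poorly: its training loss converged only to around 2.x, compared with 0.x for the other methods, which suggests that adding Embedding parameters alone is not enough for this task.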

To check whether the fine-tuned freeze model forgets its original abilities, see test_forgetting.py. Sample outputs:


![](images/ft_fanyi.png)




![](images/ft_code.png)




![](images/ft_qa.png)



### Text Generation
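- The data comes from a [competition dataset](https://tianchi.aliyun.com/competition/entrance/531826/introduction); 20 samples are used as the test set.
- PT is the P-Tuning V2 method; PT-Only-Embedding applies the soft-prompt to the Embedding only; Freeze and Lora are the methods described above, with Lora using rank 8.
- Training example: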
```
prompt_text
5-820

```
Training command for the freeze method:
```
CUDA_VISIBLE_DEVICES=0 nohup deepspeed --master_port 5555 finetuning_freeze.py --train_path "data/d2q_0.json" --output_dir "output_dir_freeze/" --prompt_text "" > log_fz.log 2>&1 &
```

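Instead of BLEU or Rouge, the D2Q outputs are scored with a custom rubric: each sample is worth 5 points, over 20 samples in total.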
- 
	- 0.25
	- 0.25
- 
	- 0.25
	- 0.25
	- 0.25
	
For the test data and prediction code, see d2q_result_data/ and predict_d2q.py.

| Method | Original model | PT-Only-Embedding | PT | Freeze | Lora |
| ------- | ------ | ------ | ------ | ------ | ------ |
| Score | 51.75 | 73.75 | 87.75 | 79.25 | 86.75 |
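The totals in the table are multiples of 0.25 on a 100-point scale (20 samples at 5 points each). A minimal sketch of how such a total could be tallied, assuming each 0.25 in the rubric is a deduction per rule violation (an assumption; the violation counts below are made up):

```python
def sample_score(num_violations, max_points=5.0, penalty=0.25):
    """Score for one sample: start from max_points, deduct per violation."""
    return max(0.0, max_points - penalty * num_violations)


def total_score(violations_per_sample, max_points=5.0, penalty=0.25):
    """Sum of per-sample scores over the whole test set."""
    return sum(sample_score(v, max_points, penalty) for v in violations_per_sample)


if __name__ == "__main__":
    # Hypothetical violation counts for 20 samples.
    violations = [0, 1, 2, 0, 3] * 4
    print(total_score(violations))
```

Clamping at zero keeps a single very poor sample from subtracting points from the others.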




## Pipeline-Parallel Training
For details on pipeline-parallel training, see [Pipeline](https://zhuanlan.zhihu.com/p/636488690).

The training code is train_pipeline.py in the GitHub repository:
```
CUDA_VISIBLE_DEVICES=0,1,2,3 deepspeed --master_port 5524 train_pipeline.py --train_path data/spo_0.json --model_name_or_path ./ChatGLM-6B/ --per_device_train_batch_size 14 --max_len 1024 --max_src_len 512 --num_train_epochs 5 --gradient_accumulation_steps 1 --seed 1234 --show_loss_step 20 --num_stages 4 --save_model_step 100 --output_dir ./output-glm-pp
```
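The command above uses four GPUs with `--num_stages 4`, so the model's layers are divided into four pipeline stages. The sketch below shows one simple way to split layers into contiguous, near-equal stages. It is an illustration only: the layer count of 28 is an assumed value, and DeepSpeed's actual assignment depends on its own partitioning method and on non-transformer layers such as the embedding and output head.

```python
def partition_layers(num_layers, num_stages):
    """Split layer indices into contiguous, near-equal chunks, one per stage."""
    if num_stages <= 0 or num_stages > num_layers:
        raise ValueError("num_stages must be between 1 and num_layers")
    base, extra = divmod(num_layers, num_stages)
    stages, start = [], 0
    for stage in range(num_stages):
        # The first `extra` stages take one additional layer.
        size = base + (1 if stage < extra else 0)
        stages.append(list(range(start, start + size)))
        start += size
    return stages


if __name__ == "__main__":
    for stage, layers in enumerate(partition_layers(28, 4)):
        print(f"stage {stage}: layers {layers[0]}..{layers[-1]}")
```

Each GPU then holds only its own stage's layers and passes activations to the next stage.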
The model conversion script is convert_model_to_hf.py in the GitHub repository:
```
python3 convert_model_to_hf.py --ori_model_dir ./ChatGLM-6B/ --pipeline_model_dir output-glm-pp/global_step300/ --save_model_dir output-glm-pp/gs300/
```
| Step | 100 | 200 | 300 | 400 | 500 |
| ------- | ------ | ------ | ------ | ------ | ------ |
| F1 | 0.4931 | 0.5132 | 0.5882 | 0.5793 | 0.5874 |

Compare these with the PT, Freeze, and Lora F1 scores reported above.

Owner

Name: Dr. Artificial曾小健
Login: ArtificialZeng
Kind: user
Location: Beijing

Website: https://blog.csdn.net/sinat_37574187?type=blog
Repositories: 171
Profile: https://github.com/ArtificialZeng

LLM practitioner/engineer, AI/ML/DL Quant
