https://github.com/artificialzeng/baichuan2
A series of large language models developed by Baichuan Intelligent Technology
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (5.8%) to scientific vocabulary
Last synced: 10 months ago
Repository
Basic Info
- Host: GitHub
- Owner: ArtificialZeng
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://huggingface.co/baichuan-inc
- Size: 4.5 MB
Statistics
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of baichuan-inc/Baichuan2
Created almost 3 years ago
· Last pushed almost 3 years ago
https://github.com/ArtificialZeng/Baichuan2/blob/main/
Baichuan 2
[53B](https://www.baichuan-ai.com/) · [License](https://github.com/baichuan-inc/Baichuan2/blob/main/LICENSE)

# Introduction

- Baichuan 2 is a series of open large language models from Baichuan Intelligent Technology, trained on **2.6 trillion** tokens.
- Baichuan 2 is evaluated on general, domain-specific, and multilingual benchmarks; the results are listed in the Benchmark section.
- This release covers the **7B** and **13B** sizes, each in a **Base** and a **Chat** version, and a **4bits quantized** version of each Chat model.
- Usage terms are described in the license section at the end of this README.
- Technical report: [Baichuan 2: Open Large-scale Language Models](https://cdn.baichuan-ai.com/paper/Baichuan2-technical-report.pdf)

| | Base | Chat | Chat 4bits |
|:-------:|:-------:|:-------:|:-----------------:|
| 7B | [Baichuan2-7B-Base](https://huggingface.co/baichuan-inc/Baichuan2-7B-Base) | [Baichuan2-7B-Chat](https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat) | [Baichuan2-7B-Chat-4bits](https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat-4bits) |
| 13B | [Baichuan2-13B-Base](https://huggingface.co/baichuan-inc/Baichuan2-13B-Base) | [Baichuan2-13B-Chat](https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat) | [Baichuan2-13B-Chat-4bits](https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat-4bits) |

# Benchmark

The models were evaluated in six areas: general, legal, medical, mathematics, code, and multilingual translation.

## General Domain

The following datasets were evaluated 5-shot (BBH 3-shot, as marked in the tables).

- [C-Eval](https://cevalbenchmark.com/index.html#home) covers 52 disciplines. The dev set supplies the few-shot examples and scoring uses the test set, following the same setup as [Baichuan-7B](https://github.com/baichuan-inc/Baichuan-7B/tree/main).
- [MMLU](https://arxiv.org/abs/2009.03300) covers 57 tasks and is a widely used English benchmark for LLMs. Evaluation follows the [open-source evaluation code](https://github.com/hendrycks/test).
- [CMMLU](https://github.com/haonan-li/CMMLU) covers 67 topics. Evaluation follows the [official evaluation code](https://github.com/haonan-li/CMMLU).
- [Gaokao](https://github.com/OpenLMLab/GAOKAO-Bench) is evaluated with a setup similar to C-Eval.
- [AGIEval](https://github.com/microsoft/AGIEval) is evaluated with a setup similar to C-Eval.
- [BBH](https://huggingface.co/datasets/lukaemon/bbh) is a subset of challenging tasks selected from the 204 tasks in Big-Bench.

### 7B

| | **C-Eval** | **MMLU** | **CMMLU** | **Gaokao** | **AGIEval** | **BBH** |
|:---------------------:|:----------:|:--------:|:---------:|:----------:|:-----------:|:-------:|
| | 5-shot | 5-shot | 5-shot | 5-shot | 5-shot | 3-shot |
| **GPT-4** | 68.40 | 83.93 | 70.33 | 66.15 | 63.27 | 75.12 |
| **GPT-3.5 Turbo** | 51.10 | 68.54 | 54.06 | 47.07 | 46.13 | 61.59 |
| **LLaMA-7B** | 27.10 | 35.10 | 26.75 | 27.81 | 28.17 | 32.38 |
| **LLaMA2-7B** | 28.90 | 45.73 | 31.38 | 25.97 | 26.53 | 39.16 |
| **MPT-7B** | 27.15 | 27.93 | 26.00 | 26.54 | 24.83 | 35.20 |
| **Falcon-7B** | 24.23 | 26.03 | 25.66 | 24.24 | 24.10 | 28.77 |
| **ChatGLM2-6B** | 50.20 | 45.90 | 49.00 | 49.44 | 45.28 | 31.65 |
| **Baichuan-7B** | 42.80 | 42.30 | 44.02 | 36.34 | 34.44 | 32.48 |
| **Baichuan2-7B-Base** | 54.00 | 54.16 | 57.07 | 47.47 | 42.73 | 41.56 |

### 13B

| | **C-Eval** | **MMLU** | **CMMLU** | **Gaokao** | **AGIEval** | **BBH** |
|:---------------------------:|:----------:|:--------:|:---------:|:----------:|:-----------:|:-------:|
| | 5-shot | 5-shot | 5-shot | 5-shot | 5-shot | 3-shot |
| **GPT-4** | 68.40 | 83.93 | 70.33 | 66.15 | 63.27 | 75.12 |
| **GPT-3.5 Turbo** | 51.10 | 68.54 | 54.06 | 47.07 | 46.13 | 61.59 |
| **LLaMA-13B** | 28.50 | 46.30 | 31.15 | 28.23 | 28.22 | 37.89 |
| **LLaMA2-13B** | 35.80 | 55.09 | 37.99 | 30.83 | 32.29 | 46.98 |
| **Vicuna-13B** | 32.80 | 52.00 | 36.28 | 30.11 | 31.55 | 43.04 |
| **Chinese-Alpaca-Plus-13B** | 38.80 | 43.90 | 33.43 | 34.78 | 35.46 | 28.94 |
| **XVERSE-13B** | 53.70 | 55.21 | 58.44 | 44.69 | 42.54 | 38.06 |
| **Baichuan-13B-Base** | 52.40 | 51.60 | 55.30 | 49.69 | 43.20 | 43.01 |
| **Baichuan2-13B-Base** | 58.10 | 59.17 | 61.97 | 54.33 | 48.17 | 48.78 |

## Legal and Medical

The legal domain uses the [JEC-QA](https://jecqa.thunlp.org/) dataset, evaluated with a setup similar to C-Eval.

The medical domain uses the medicine-related subjects of the general-domain datasets (C-Eval, MMLU, CMMLU) together with [MedQA](https://arxiv.org/abs/2009.13081) and [MedMCQA](https://medmcqa.github.io/), again with a setup similar to C-Eval.

- C-Eval is scored on its val set.
- MedQA uses the [MedQA](https://huggingface.co/datasets/bigbio/med_qa) release on Hugging Face, with its USMLE and MCMLE parts reported separately.
- MedMCQA is scored on the dev set rather than the test set.
- The medicine-related subjects taken from the general-domain datasets are:
  - C-Eval: clinical_medicine, basic_medicine
  - MMLU: clinical_knowledge, anatomy, college_medicine, college_biology, nutrition, virology, medical_genetics, professional_medicine
  - CMMLU: anatomy, clinical_knowledge, college_medicine, genetics, nutrition, traditional_chinese_medicine, virology

All of these datasets were evaluated 5-shot.

### 7B

| | **JEC-QA** | **CEval-MMLU-CMMLU** | **MedQA-USMLE** | **MedQA-MCMLE** | **MedMCQA** |
|:---------------------:|:----------:|:--------------------:|:---------------:|:---------------:|:-----------:|
| | 5-shot | 5-shot | 5-shot | 5-shot | 5-shot |
| **GPT-4** | 59.32 | 77.16 | 80.28 | 74.58 | 72.51 |
| **GPT-3.5 Turbo** | 42.31 | 61.17 | 53.81 | 52.92 | 56.25 |
| **LLaMA-7B** | 27.45 | 33.34 | 24.12 | 21.72 | 27.45 |
| **LLaMA2-7B** | 29.20 | 36.75 | 27.49 | 24.78 | 37.93 |
| **MPT-7B** | 27.45 | 26.67 | 16.97 | 19.79 | 31.96 |
| **Falcon-7B** | 23.66 | 25.33 | 21.29 | 18.07 | 33.88 |
| **ChatGLM2-6B** | 40.76 | 44.54 | 26.24 | 45.53 | 30.22 |
| **Baichuan-7B** | 34.64 | 42.37 | 27.42 | 39.46 | 31.39 |
| **Baichuan2-7B-Base** | 44.46 | 56.39 | 32.68 | 54.93 | 41.73 |

### 13B

| | **JEC-QA** | **CEval-MMLU-CMMLU** | **MedQA-USMLE** | **MedQA-MCMLE** | **MedMCQA** |
|:---------------------------:|:----------:|:--------------------:|:---------------:|:---------------:|:-----------:|
| | 5-shot | 5-shot | 5-shot | 5-shot | 5-shot |
| **GPT-4** | 59.32 | 77.16 | 80.28 | 74.58 | 72.51 |
| **GPT-3.5 Turbo** | 42.31 | 61.17 | 53.81 | 52.92 | 56.25 |
| **LLaMA-13B** | 27.54 | 35.14 | 28.83 | 23.38 | 39.52 |
| **LLaMA2-13B** | 34.08 | 47.42 | 35.04 | 29.74 | 42.12 |
| **Vicuna-13B** | 28.38 | 40.99 | 34.80 | 27.67 | 40.66 |
| **Chinese-Alpaca-Plus-13B** | 35.32 | 46.31 | 27.49 | 32.66 | 35.87 |
| **XVERSE-13B** | 46.42 | 58.08 | 32.99 | 58.76 | 41.34 |
| **Baichuan-13B-Base** | 41.34 | 51.77 | 29.07 | 43.67 | 39.60 |
| **Baichuan2-13B-Base** | 47.40 | 59.33 | 40.38 | 61.62 | 42.86 |

## Mathematics and Code

For mathematics, the [OpenCompass](https://opencompass.org.cn/) framework was used to evaluate [GSM8K](https://huggingface.co/datasets/gsm8k) and [MATH](https://huggingface.co/datasets/competition_math), both 4-shot.

- GSM8K is a dataset of about 8.5K math word problems released by OpenAI.
- MATH contains 12,500 problems (7,500 for training and 5,000 for testing) collected from competitions such as AMC 10, AMC 12, and AIME.

For code, [HumanEval](https://huggingface.co/datasets/openai_humaneval) and [MBPP](https://huggingface.co/datasets/mbpp) were evaluated with OpenCompass: HumanEval 0-shot and MBPP 3-shot.

- HumanEval is a set of programming tasks.
- MBPP contains 974 short Python programming problems.

### 7B

| | **GSM8K** | **MATH** | **HumanEval** | **MBPP** |
|:---------------------:|:---------:|:--------:|:-------------:|:--------:|
| | 4-shot | 4-shot | 0-shot | 3-shot |
| **GPT-4** | 89.99 | 40.20 | 69.51 | 63.60 |
| **GPT-3.5 Turbo** | 57.77 | 13.96 | 52.44 | 61.40 |
| **LLaMA-7B** | 9.78 | 3.02 | 11.59 | 14.00 |
| **LLaMA2-7B** | 16.22 | 3.24 | 12.80 | 14.80 |
| **MPT-7B** | 8.64 | 2.90 | 14.02 | 23.40 |
| **Falcon-7B** | 5.46 | 1.68 | - | 10.20 |
| **ChatGLM2-6B** | 28.89 | 6.40 | 9.15 | 9.00 |
| **Baichuan-7B** | 9.17 | 2.54 | 9.20 | 6.60 |
| **Baichuan2-7B-Base** | 24.49 | 5.58 | 18.29 | 24.20 |

### 13B

| | **GSM8K** | **MATH** | **HumanEval** | **MBPP** |
|:---------------------------:|:---------:|:--------:|:-------------:|:--------:|
| | 4-shot | 4-shot | 0-shot | 3-shot |
| **GPT-4** | 89.99 | 40.20 | 69.51 | 63.60 |
| **GPT-3.5 Turbo** | 57.77 | 13.96 | 52.44 | 61.40 |
| **LLaMA-13B** | 20.55 | 3.68 | 15.24 | 21.40 |
| **LLaMA2-13B** | 28.89 | 4.96 | 15.24 | 27.00 |
| **Vicuna-13B** | 28.13 | 4.36 | 16.46 | 15.00 |
| **Chinese-Alpaca-Plus-13B** | 11.98 | 2.50 | 16.46 | 20.00 |
| **XVERSE-13B** | 18.20 | 2.18 | 15.85 | 16.80 |
| **Baichuan-13B-Base** | 26.76 | 4.84 | 11.59 | 22.80 |
| **Baichuan2-13B-Base** | 52.77 | 10.08 | 17.07 | 30.20 |

## Multilingual Translation

[Flores-101](https://huggingface.co/datasets/facebook/flores) covers 101 languages. Using OpenCompass, seven Flores-101 translation subtasks were evaluated 8-shot: Chinese to English, French, Spanish, Arabic, Russian, Japanese, and German.

### 7B

| | **CN-EN** | **CN-FR** | **CN-ES** | **CN-AR** | **CN-RU** | **CN-JP** | **CN-DE** | Average |
|:---------------------:|:-------:|:-------:|:---------:|:---------:|:-------:|:-------:|:-------:|:-------:|
| **GPT-4** | 29.94 | 29.56 | 20.01 | 10.76 | 18.62 | 13.26 | 20.83 | 20.43 |
| **GPT-3.5 Turbo** | 27.67 | 26.15 | 19.58 | 10.73 | 17.45 | 1.82 | 19.70 | 17.59 |
| **LLaMA-7B** | 17.27 | 12.02 | 9.54 | 0.00 | 4.47 | 1.41 | 8.73 | 7.63 |
| **LLaMA2-7B** | 25.76 | 15.14 | 11.92 | 0.79 | 4.99 | 2.20 | 10.15 | 10.14 |
| **MPT-7B** | 20.77 | 9.53 | 8.96 | 0.10 | 3.54 | 2.91 | 6.54 | 7.48 |
| **Falcon-7B** | 22.13 | 15.67 | 9.28 | 0.11 | 1.35 | 0.41 | 6.41 | 7.91 |
| **ChatGLM2-6B** | 22.28 | 9.42 | 7.77 | 0.64 | 1.78 | 0.26 | 4.61 | 6.68 |
| **Baichuan-7B** | 25.07 | 16.51 | 12.72 | 0.41 | 6.66 | 2.24 | 9.86 | 10.50 |
| **Baichuan2-7B-Base** | 27.27 | 20.87 | 16.17 | 1.39 | 11.21 | 3.11 | 12.76 | 13.25 |

### 13B

| | **CN-EN** | **CN-FR** | **CN-ES** | **CN-AR** | **CN-RU** | **CN-JP** | **CN-DE** | Average |
|:---------------------------:|:-------:|:-------:|:---------:|:---------:|:-------:|:-------:|:-------:|:-------:|
| **GPT-4** | 29.94 | 29.56 | 20.01 | 10.76 | 18.62 | 13.26 | 20.83 | 20.43 |
| **GPT-3.5 Turbo** | 27.67 | 26.15 | 19.58 | 10.73 | 17.45 | 1.82 | 19.70 | 17.59 |
| **LLaMA-13B** | 21.75 | 16.16 | 13.29 | 0.58 | 7.61 | 0.41 | 10.66 | 10.07 |
| **LLaMA2-13B** | 25.44 | 19.25 | 17.49 | 1.38 | 10.34 | 0.13 | 11.13 | 12.17 |
| **Vicuna-13B** | 22.63 | 18.04 | 14.67 | 0.70 | 9.27 | 3.59 | 10.25 | 11.31 |
| **Chinese-Alpaca-Plus-13B** | 22.53 | 13.82 | 11.29 | 0.28 | 1.52 | 0.31 | 8.13 | 8.27 |
| **XVERSE-13B** | 29.26 | 24.03 | 16.67 | 2.78 | 11.61 | 3.08 | 14.26 | 14.53 |
| **Baichuan-13B-Base** | 30.24 | 20.90 | 15.92 | 0.98 | 9.65 | 2.64 | 12.00 | 13.19 |
| **Baichuan2-13B-Base** | 30.61 | 22.11 | 17.27 | 2.39 | 14.17 | 11.58 | 14.53 | 16.09 |

# Inference and Deployment

The model weights and the code needed for inference are published on Hugging Face; the download links are in the table at the top of this README.

## Install Dependencies

```shell
pip install -r requirements.txt
```

## Python Inference

### Chat

```python
>>> import torch
>>> from transformers import AutoModelForCausalLM, AutoTokenizer
>>> from transformers.generation.utils import GenerationConfig
>>> tokenizer = AutoTokenizer.from_pretrained("baichuan-inc/Baichuan2-13B-Chat", use_fast=False, trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained("baichuan-inc/Baichuan2-13B-Chat", device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True)
>>> model.generation_config = GenerationConfig.from_pretrained("baichuan-inc/Baichuan2-13B-Chat")
>>> messages = []
>>> # Put the user prompt in "content"; it is left empty here.
>>> messages.append({"role": "user", "content": ""})
>>> response = model.chat(tokenizer, messages)
>>> print(response)
""
```

### Base

```python
>>> from transformers import AutoModelForCausalLM, AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("baichuan-inc/Baichuan2-13B-Base", trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained("baichuan-inc/Baichuan2-13B-Base", device_map="auto", trust_remote_code=True)
>>> inputs = tokenizer('->\n->', return_tensors='pt')
>>> inputs = inputs.to('cuda:0')
>>> pred = model.generate(**inputs, max_new_tokens=64, repetition_penalty=1.1)
>>> print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
->
->
```

> Both snippets load the model with `device_map='auto'`, which uses every available GPU. To restrict the devices, set something like `export CUDA_VISIBLE_DEVICES=0,1` (this selects GPUs 0 and 1).

## Command Line Tool

```shell
python cli_demo.py
```

The command-line demo is intended for the Chat models, not the Base models.

## Web Demo

The web demo runs on streamlit. The command below starts a local web service; it is intended for the Chat models, not the Base models.

```shell
streamlit run web_demo.py
```

## Baichuan2-13B-Chat Example Outputs
`````
User: Find the median of the array [3.1,6.2,1.3,8.4,10.5,11.6,2.1], using Python.

Baichuan 2:
def find_median(numbers):
    numbers.sort()
    length = len(numbers)
    if length % 2 == 0:
        return (numbers[length // 2 - 1] + numbers[length // 2]) / 2
    else:
        return numbers[length // 2]

numbers = [3.1, 6.2, 1.3, 8.4, 10.5, 11.6, 2.1]
print(find_median(numbers))
`````
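One detail of the generated answer above: `numbers.sort()` sorts the caller's list in place. A minimal variant that leaves the input untouched, cross-checked against the standard library's `statistics.median`, might look like the sketch below. The name `median_copy` is hypothetical and used only for illustration; like the generated answer, it assumes a non-empty list.

```python
import statistics


def median_copy(numbers):
    """Return the median of a non-empty list without modifying it."""
    ordered = sorted(numbers)  # sorted() returns a new list
    mid = len(ordered) // 2
    if len(ordered) % 2 == 0:
        # Even length: average the two middle values.
        return (ordered[mid - 1] + ordered[mid]) / 2
    # Odd length: the middle value is the median.
    return ordered[mid]


data = [3.1, 6.2, 1.3, 8.4, 10.5, 11.6, 2.1]
original_order = list(data)
result = median_copy(data)
print(result)
print(result == statistics.median(data))
```

Using `sorted()` costs one extra copy of the list, which is usually acceptable when the caller still needs the original order.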
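`````
User: Chickens and rabbits are in one cage. There are 26 heads and 68 feet in total. How many chickens and how many rabbits are there?

Baichuan 2: Let x be the number of chickens and y the number of rabbits.
1. x + y = 26 (heads)
2. 2x + 4y = 68 (feet)

Solve the first equation for x:
x = 26 - y

Substitute x into the second equation:
2(26 - y) + 4y = 68
52 - 2y + 4y = 68
2y = 16

Solve for y:
y = 16 / 2
y = 8

Substitute y = 8 back to get x:
x = 26 - 8
x = 18

So there are 18 chickens and 8 rabbits.
`````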
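The substitution steps in the answer above can be checked mechanically. The sketch below solves the same two equations with Cramer's rule, using exact fractions so that no rounding is involved; `solve_2x2` is a hypothetical helper written for this check, not part of the repository.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the system has no unique solution")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y


# x + y = 26 (heads), 2x + 4y = 68 (feet)
x, y = solve_2x2(1, 1, 26, 2, 4, 68)
print(x, y)

# Substitute back into both equations as a sanity check.
print(x + y == 26, 2 * x + 4 * y == 68)
```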
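`````
User (in Chinese): [asks for every following sentence to be translated into English]
Baichuan 2: I'm ready to translate every word you say next into English.

User (in Chinese): [Hello]
Baichuan 2: Hello

User (in Chinese): [a self-introduction mentioning Baichuan 2]
Baichuan 2: I am Baichuan 2, and I am very happy to meet everyone.

User (in Chinese): [a sentence about the AGI era]
Baichuan 2: I hope through the joint efforts of all of us, we can welcome the arrival of the AGI era as soon as possible.
`````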
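A multi-turn exchange like the one above maps onto the `messages` list that the Chat example passes to `model.chat(tokenizer, messages)`. The sketch below only builds that list and does not load a model. It assumes that earlier replies are passed back as `{"role": "assistant", ...}` entries, which is an assumption about the model's remote code rather than something this README states; `add_turn` is a hypothetical helper.

```python
def add_turn(messages, role, content):
    """Append one conversation turn and return the same list."""
    # Assumption: only these two roles are used in this sketch.
    if role not in ("user", "assistant"):
        raise ValueError(f"unexpected role: {role!r}")
    messages.append({"role": role, "content": content})
    return messages


messages = []
add_turn(messages, "user", "Translate everything I say next into English.")
# In real use this reply would come from model.chat(tokenizer, messages).
add_turn(messages, "assistant", "I'm ready to translate every word you say next into English.")
add_turn(messages, "user", "Hello")

for turn in messages:
    print(f"{turn['role']}: {turn['content']}")
```

Keeping the whole history in one list means each call sees the earlier instruction (here, the request to translate) without it being repeated.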
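## Quantization

Quantized variants of Baichuan 2 are provided so that the models can run on more modest hardware. Pre-quantized 4bits weights are published for Baichuan2-7B-Chat and Baichuan2-13B-Chat.

### Quantization Method

Baichuan 2 uses [BitsAndBytes](https://github.com/TimDettmers/bitsandbytes), which is integrated into transformers. BitsAndBytes supports 8bits and 4bits quantization, and 4bits comes in two formats, FP4 and NF4. Baichuan 2 uses NF4 as its 4bits data type. With this method, Baichuan 2 can be quantized either online or offline.

### Online Quantization

Online quantization supports 8bits and 4bits. As with [Baichuan-13B](https://huggingface.co/baichuan-inc/Baichuan-13B-Chat), the model is first loaded into CPU memory, then `quantize()` is called, and finally `cuda()` moves the quantized model to the GPU. For Baichuan2-7B-Chat:

8bits:

```python
import torch
from transformers import AutoModelForCausalLM

# Load on CPU first, then quantize to 8 bits and move to the GPU.
model = AutoModelForCausalLM.from_pretrained("baichuan-inc/Baichuan2-7B-Chat", torch_dtype=torch.float16, trust_remote_code=True)
model = model.quantize(8).cuda()
```

4bits:

```python
# Same imports as the 8bits snippet above.
model = AutoModelForCausalLM.from_pretrained("baichuan-inc/Baichuan2-7B-Chat", torch_dtype=torch.float16, trust_remote_code=True)
model = model.quantize(4).cuda()
```

With online quantization, do not pass `device_map="auto"` to `from_pretrained`.

### Offline Quantization

A pre-quantized 4bits version is available: [Baichuan2-7B-Chat-4bits](https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat-4bits/tree/main). Loading Baichuan2-7B-Chat-4bits takes a single call:

```python
model = AutoModelForCausalLM.from_pretrained("baichuan-inc/Baichuan2-7B-Chat-4bits", device_map="auto", trust_remote_code=True)
```

No pre-quantized 8bits version is provided, because the Hugging Face transformers API can save and reload an 8bits model directly:

```python
# Model saving: model_id is the original model directory, and quant8_saved_dir is the directory where the 8bits quantized model is saved.
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_8bit=True, device_map="auto", trust_remote_code=True)
model.save_pretrained(quant8_saved_dir)
# Model loading: read the saved 8bits model back from quant8_saved_dir.
model = AutoModelForCausalLM.from_pretrained(quant8_saved_dir, device_map="auto", trust_remote_code=True)
```

### Quantization Results

GPU memory usage before and after quantization (GPU Mem in GB):

| Precision | Baichuan2-7B | Baichuan2-13B |
|-------------|:------------:|:------------:|
| bf16 / fp16 | 15.3 | 27.5 |
| 8bits | 8.0 | 16.1 |
| 4bits | 5.1 | 8.6 |

Benchmark results after quantization:

| Model 5-shot | C-Eval | MMLU | CMMLU |
|------------------------|:------:|:----:|:-----:|
| Baichuan2-13B-Chat | 56.74 | 57.32 | 59.68 |
| Baichuan2-13B-Chat-4bits | 56.05 | 56.24 | 58.82 |
| Baichuan2-7B-Chat | 54.35 | 52.93 | 54.99 |
| Baichuan2-7B-Chat-4bits | 53.04 | 51.72 | 52.84 |

> C-Eval is scored on the val set. Compared with bfloat16, 4bits quantization costs roughly 1 to 2 percentage points.

## CPU Deployment

Baichuan 2 can run inference on CPU, although CPU inference is comparatively slow.

```python
# Taking Baichuan2-7B-Chat as an example
model = AutoModelForCausalLM.from_pretrained("baichuan-inc/Baichuan2-7B-Chat", torch_dtype=torch.float32, trust_remote_code=True)
```

## Migrating Inference Optimizations from Baichuan 1 to Baichuan 2

Optimizations built for Baichuan 1 (Baichuan-7B, Baichuan-13B) can be reused with Baichuan 2. The two architectures are largely the same; the difference is that Baichuan 2 normalizes the lm_head at inference time. Normalizing `lm_head.weight` once and saving the result lets the Baichuan 2 weights be used the same way as Baichuan 1:

```python
import torch
import os

ori_model_dir = 'your Baichuan 2 model directory'
# To avoid overwriting the original model, it's best to save the converted model to another directory before replacing it
new_model_dir = 'your normalized lm_head weight Baichuan 2 model directory'
os.makedirs(new_model_dir, exist_ok=True)  # make sure the target directory exists

# Load the state dict, normalize each row of lm_head.weight to unit L2 norm, and save it.
model = torch.load(os.path.join(ori_model_dir, 'pytorch_model.bin'))
lm_head_w = model['lm_head.weight']
lm_head_w = torch.nn.functional.normalize(lm_head_w)
model['lm_head.weight'] = lm_head_w
torch.save(model, os.path.join(new_model_dir, 'pytorch_model.bin'))
```

# Fine-tuning

## Install Dependencies

```shell
git clone https://github.com/baichuan-inc/Baichuan2.git
cd Baichuan2/fine-tune
pip install -r requirements.txt
```

- To use LoRA, install [peft](https://github.com/huggingface/peft).
- To use xFormers, install [xFormers](https://github.com/facebookresearch/xformers).

## Single-Machine Training

The example below fine-tunes Baichuan2-7B-Base on a single machine. The training data, `data/belle_chat_ramdon_10k.json`, is a sample of 10,000 conversations drawn from [multiturn_chat_0.8M](https://huggingface.co/datasets/BelleGroup/multiturn_chat_0.8M).

```shell
hostfile=""
deepspeed --hostfile=$hostfile fine-tune.py \
    --report_to "none" \
    --data_path "data/belle_chat_ramdon_10k.json" \
    --model_name_or_path "baichuan-inc/Baichuan2-7B-Base" \
    --output_dir "output" \
    --model_max_length 512 \
    --num_train_epochs 4 \
    --per_device_train_batch_size 16 \
    --gradient_accumulation_steps 1 \
    --save_strategy epoch \
    --learning_rate 2e-5 \
    --lr_scheduler_type constant \
    --adam_beta1 0.9 \
    --adam_beta2 0.98 \
    --adam_epsilon 1e-8 \
    --max_grad_norm 1.0 \
    --weight_decay 1e-4 \
    --warmup_ratio 0.0 \
    --logging_steps 1 \
    --gradient_checkpointing True \
    --deepspeed ds_config.json \
    --bf16 True \
    --tf32 True
```

## Multi-Machine Training

Multi-machine training needs a hostfile that lists each node and its number of slots:

```
ip1 slots=8
ip2 slots=8
ip3 slots=8
ip4 slots=8
....
```

Then pass the path of the hostfile in the launch script:

```shell
hostfile="/path/to/hostfile"
deepspeed --hostfile=$hostfile fine-tune.py \
    --report_to "none" \
    --data_path "data/belle_chat_ramdon_10k.json" \
    --model_name_or_path "baichuan-inc/Baichuan2-7B-Base" \
    --output_dir "output" \
    --model_max_length 512 \
    --num_train_epochs 4 \
    --per_device_train_batch_size 16 \
    --gradient_accumulation_steps 1 \
    --save_strategy epoch \
    --learning_rate 2e-5 \
    --lr_scheduler_type constant \
    --adam_beta1 0.9 \
    --adam_beta2 0.98 \
    --adam_epsilon 1e-8 \
    --max_grad_norm 1.0 \
    --weight_decay 1e-4 \
    --warmup_ratio 0.0 \
    --logging_steps 1 \
    --gradient_checkpointing True \
    --deepspeed ds_config.json \
    --bf16 True \
    --tf32 True
```

## LoRA Fine-tuning

The training code supports LoRA. To enable it, add this flag to the launch command:

```shell
--use_lora True
```

A model trained with LoRA through `fine-tune.py` can be loaded as follows:

```python
from peft import AutoPeftModelForCausalLM
model = AutoPeftModelForCausalLM.from_pretrained("output", trust_remote_code=True)
```

# Intermediate Checkpoints

In addition to the Baichuan2-7B-Base model trained on 2.6 trillion tokens, 11 intermediate checkpoints are available, trained on roughly 0.2 to 2.4 trillion tokens ([download](https://huggingface.co/baichuan-inc/Baichuan2-7B-Intermediate-Checkpoints)).

[Figure: scores of the intermediate checkpoints on the C-Eval, MMLU, and CMMLU benchmarks]

Further example outputs from Baichuan 2:

`````
User: Which of the following physics statements is wrong?
A. In natural environments, sound travels fastest in solids.
B. Newton's first law: an object on which no force acts stays at rest or moves in a straight line at constant speed.
C. Newton's third law: for every action there is an equal and opposite reaction.
D. The speed of sound in air is 1000 m/s.

Baichuan 2: Statement A is correct. Statement B is correct. Statement C is correct. Statement D is wrong: the speed of sound in air is about 340 m/s, not 1000 m/s. The answer is D.
`````

`````
User: I'm Mike. I am going to have a busy weekend. On Saturday, I am going to learn how to swim. I will go with my father. Then we are going to have lunch in the restaurant. In the afternoon, we are going to the zoo. We are going to see the pandas. On Sunday, I will finish my homework in the morning. In the afternoon, my parents and I are going to visit my grandparents. We will have dinner together.
Read the passage and mark each statement T or F:
1. Mike is going to have a busy weekend ()
2. Mike is going to learn how to swim with his father ()
3. They are going to have lunch at home ()

Baichuan 2: 1. T 2. T 3. F
`````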
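For the fine-tuning commands in this README, it can help to know the effective global batch size a given hostfile produces. The sketch below parses the `ipN slots=M` format shown in the multi-machine section and multiplies the slot count by `--per_device_train_batch_size` and `--gradient_accumulation_steps`. It assumes one training process (one GPU) per slot and plain data parallelism; `count_slots` is a hypothetical helper, not part of the repository.

```python
def count_slots(hostfile_text):
    """Sum the slots=N values over all host lines in a DeepSpeed-style hostfile."""
    total = 0
    for line in hostfile_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines without a slots field (such as "....").
        if not line or line.startswith("#") or "slots=" not in line:
            continue
        total += int(line.split("slots=")[1].split()[0])
    return total


hostfile_text = """
ip1 slots=8
ip2 slots=8
ip3 slots=8
ip4 slots=8
....
"""

per_device_train_batch_size = 16  # value used in the commands in this README
gradient_accumulation_steps = 1   # value used in the commands in this README

gpus = count_slots(hostfile_text)
global_batch = gpus * per_device_train_batch_size * gradient_accumulation_steps
print(gpus, global_batch)
```

Under these assumptions, the four-node hostfile gives 32 processes, so each optimizer step sees 512 sequences; adjust the learning rate or accumulation steps if you change the node count.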
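# Community and Ecosystem

**Community and ecosystem support for Baichuan 2 is tracked here.**

## Ascend

### PyTorch

Fine-tuning: Baichuan 2 can be fine-tuned on Ascend NPUs with PyTorch + DeepSpeed. The modeling code and README are available for [Baichuan2-7B](https://gitee.com/ascend/ModelZoo-PyTorch/tree/master/PyTorch/built-in/foundation/Baichuan2/7B) and Baichuan2-13B.

Inference: Baichuan 2 can run inference on Ascend NPUs. The modeling code and README are available for [Baichuan2-7B](https://gitee.com/ascend/ModelZoo-PyTorch/tree/master/ACL_PyTorch/built-in/foundation_models/baichuan2/7b) and [Baichuan2-13B](https://gitee.com/ascend/ModelZoo-PyTorch/tree/master/ACL_PyTorch/built-in/foundation_models/baichuan2/13b).

### MindSpore

[MindFormers](https://gitee.com/mindspore/mindformers) is a development suite built on MindSpore. [Baichuan2-7B / 13B](https://gitee.com/mindspore/mindformers/tree/dev/research/baichuan2) is integrated into it; see the [README](https://gitee.com/mindspore/mindformers/tree/dev/research/baichuan2/baichuan2.md) for usage.

### Model Experience Platform

The [Xihe platform](https://xihe.mindspore.cn), built on the MindSpore AI framework and MindFormers, hosts [Baichuan2-7B](https://xihe.mindspore.cn/modelzoo/baichuan2_7b_chat).

# Disclaimer and License

## Disclaimer

The development team has not built any application on top of Baichuan 2, whether for iOS, Android, or any other platform.

## License

Community use of Baichuan 2 is subject to [Apache 2.0](https://github.com/baichuan-inc/Baichuan2/blob/main/LICENSE) and the [Baichuan 2 Model Community License Agreement](https://huggingface.co/baichuan-inc/Baichuan2-7B-Base/resolve/main/Baichuan%202%E6%A8%A1%E5%9E%8B%E7%A4%BE%E5%8C%BA%E8%AE%B8%E5%8F%AF%E5%8D%8F%E8%AE%AE.pdf). Commercial use of Baichuan 2 is governed by that agreement.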
Owner
- Name: Dr. Artificial曾小健
- Login: ArtificialZeng
- Kind: user
- Location: Beijing
- Website: https://blog.csdn.net/sinat_37574187?type=blog
- Repositories: 171
- Profile: https://github.com/ArtificialZeng
LLM practitioner/engineer, AI/ML/DL Quant