https://github.com/artificialzeng/chatglm2-6b

ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (3.9%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: ArtificialZeng
  • License: other
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 4.91 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Fork of THUDM/ChatGLM2-6B
Created almost 3 years ago · Last pushed almost 3 years ago

https://github.com/ArtificialZeng/ChatGLM2-6B/blob/main/

# ChatGLM2-6B

HF Repo Twitter [GLM@ACL 22] [GitHub] [GLM-130B@ICLR 23] [GitHub]

Slack WeChat

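*Read this in [English](README_EN.md)*

## Introduction

ChatGLM**2**-6B is the second-generation version of the open-source bilingual (Chinese-English) chat model [ChatGLM-6B](https://github.com/THUDM/ChatGLM-6B). It keeps the smooth conversation flow and low deployment threshold of the first generation, and ChatGLM**2**-6B adds the following features:

1. **Stronger performance**: Building on the experience of developing the first-generation ChatGLM model, we fully upgraded the base model of ChatGLM2-6B. ChatGLM2-6B uses the hybrid objective function of [GLM](https://github.com/THUDM/GLM) and was pre-trained on 1.4T bilingual tokens, followed by human preference alignment training. The [evaluation results](#evaluation-results) show that, compared with the first-generation model, ChatGLM2-6B improves substantially on MMLU (+23%), CEval (+33%), GSM8K (+571%), and BBH (+60%).
2. **Longer context**: Using [FlashAttention](https://github.com/HazyResearch/flash-attention), we extended the context length of the base model from 2K in ChatGLM-6B to 32K, and trained with a context length of 8K during dialogue alignment, which allows more dialogue turns. The current version of ChatGLM2-6B still has limited ability to understand a single very long document.
3. **More efficient inference**: Using [Multi-Query Attention](http://arxiv.org/abs/1911.02150), ChatGLM2-6B generates faster and uses less GPU memory. With the official implementation, inference is 42% faster than the first generation, and under INT4 quantization the dialogue length supported by 6G of GPU memory grows from 1K to 8K.
4. **More open license**: ChatGLM2-6B weights are **fully open** for academic research, and **commercial use is permitted** with official written permission. If you find our open-source model useful for your business, we welcome donations toward development of the next-generation model ChatGLM3.

-----

The ChatGLM2-6B open-source model is intended to advance large-model technology together with the open-source community. Please comply with the [Model License](MODEL_LICENSE). **The project team has not developed any applications based on ChatGLM2-6B, including web, Android, Apple iOS, and Windows App applications.**

Because ChatGLM2-6B is a relatively small model and its output is probabilistic, the accuracy of its output cannot be guaranteed, and the model can be misled. **This project accepts no responsibility for risks arising from misuse of the open-source model and code.**

## Evaluation Results

We evaluated ChatGLM2-6B on a set of typical Chinese and English datasets: [MMLU](https://github.com/hendrycks/test) (English), [C-Eval](https://cevalbenchmark.com/static/leaderboard.html) (Chinese), [GSM8K](https://github.com/openai/grade-school-math) (math), and [BBH](https://github.com/suzgunmirac/BIG-Bench-Hard) (English). The script for the C-Eval evaluation is in [evaluation](./evaluation/README.md).

### MMLU

| Model | Average | STEM | Social Sciences | Humanities | Others |
| ----- | ----- | ---- | ----- | ----- | ----- |
| ChatGLM-6B | 40.63 | 33.89 | 44.84 | 39.02 | 45.71 |
| ChatGLM2-6B (base) | 47.86 | 41.20 | 54.44 | 43.66 | 54.46 |
| ChatGLM2-6B | 45.46 | 40.06 | 51.61 | 41.23 | 51.24 |

> Chat models are evaluated zero-shot with CoT (Chain-of-Thought); base models are evaluated few-shot, answer-only.

### C-Eval

| Model | Average | STEM | Social Sciences | Humanities | Others |
| ----- | ---- | ---- | ----- | ----- | ----- |
| ChatGLM-6B | 38.9 | 33.3 | 48.3 | 41.3 | 38.0 |
| ChatGLM2-6B (base) | 51.7 | 48.6 | 60.5 | 51.3 | 49.8 |
| ChatGLM2-6B | 50.1 | 46.4 | 60.4 | 50.6 | 46.9 |

> Chat models are evaluated zero-shot with CoT; base models are evaluated few-shot, answer-only.

### GSM8K

| Model | Accuracy | Accuracy (Chinese)* |
| ----- | ----- | ----- |
| ChatGLM-6B | 4.82 | 5.85 |
| ChatGLM2-6B (base) | 32.37 | 28.95 |
| ChatGLM2-6B | 28.05 | 20.45 |

> All models are evaluated few-shot with CoT. The CoT prompts come from http://arxiv.org/abs/2201.11903
>
> \* We used a translation API to translate 500 GSM8K questions and the CoT prompts into Chinese, then proofread them manually.

### BBH

| Model | Accuracy |
| ----- | ----- |
| ChatGLM-6B | 18.73 |
| ChatGLM2-6B (base) | 33.68 |
| ChatGLM2-6B | 30.00 |

> All models are evaluated few-shot with CoT. The CoT prompts come from https://github.com/suzgunmirac/BIG-Bench-Hard/tree/main/cot-prompts

## Inference Performance

ChatGLM2-6B uses [Multi-Query Attention](http://arxiv.org/abs/1911.02150) to speed up generation. The average speed when generating 2000 characters is:

| Model | Inference speed (characters/s) |
| ---- | ----- |
| ChatGLM-6B | 31.49 |
| ChatGLM2-6B | 44.62 |

> Measured with batch size = 1, max length = 2048, bf16 precision, on an A100-SXM4-80G with PyTorch 2.0.1.

Multi-Query Attention also reduces the GPU memory taken by the KV Cache during generation. In addition, ChatGLM2-6B uses a Causal Mask for dialogue training, so the KV Cache of earlier turns can be reused in multi-turn dialogue, which lowers memory use further. As a result, on a GPU with 6GB of memory and INT4 quantization, ChatGLM-6B can generate at most 1119 characters before running out of memory, while ChatGLM2-6B can generate at least 8192 characters.

| **Quantization level** | **Minimum GPU memory to encode length 2048** | **Minimum GPU memory to generate length 8192** |
| -------------- |---------------------|---------------------|
| FP16 / BF16 | 13.1 GB | 12.8 GB |
| INT8 | 8.2 GB | 8.1 GB |
| INT4 | 5.5 GB | 5.1 GB |

> ChatGLM2-6B uses the PyTorch 2.0 function `torch.nn.functional.scaled_dot_product_attention` for efficient Attention computation. With an older PyTorch version it falls back to a plain Attention implementation, and GPU memory use may be higher than in the table above.

We also measured the effect of quantization on model quality. The results show that the impact is within an acceptable range.

| Quantization level | Accuracy (MMLU) | Accuracy (C-Eval dev) |
| ----- | ----- |-----------------------|
| BF16 | 45.47 | 53.57 |
| INT4 | 43.13 | 50.30 |

## ChatGLM2-6B Examples

Compared with the first-generation model, ChatGLM2-6B improves in several areas. The images that follow show some example comparisons.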
![](resources/math.png)
![](resources/knowledge.png)
![](resources/long-context.png)
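
The inference figures reported in this README can be sanity-checked with a little arithmetic. The sketch below first recomputes the speedup implied by the two measured speeds (31.49 and 44.62 characters per second), then estimates how much the KV Cache shrinks when only a few key/value heads are stored, which is the idea behind Multi-Query Attention. The architecture numbers (28 layers, 32 attention heads, 2 key/value groups, head size 128, 2 bytes per element) are assumptions chosen for illustration and are not stated in this README. The multi-head baseline is a hypothetical model with the same dimensions, not a measurement of ChatGLM-6B.

```python
def kv_cache_bytes(num_tokens, num_layers, num_kv_heads, head_dim, bytes_per_element=2):
    """Rough size in bytes of the key/value cache for `num_tokens` cached tokens."""
    # Factor 2: each layer stores one key tensor and one value tensor.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_element * num_tokens


# Speeds taken from the inference-speed table (characters per second).
speed_v1, speed_v2 = 31.49, 44.62
speedup_percent = (speed_v2 / speed_v1 - 1) * 100

# ASSUMED architecture values, for illustration only (not from this README).
NUM_LAYERS = 28
NUM_HEADS = 32      # key/value heads in a hypothetical multi-head baseline
NUM_KV_GROUPS = 2   # key/value heads kept under multi-query attention
HEAD_DIM = 128
NUM_TOKENS = 8192

mha_bytes = kv_cache_bytes(NUM_TOKENS, NUM_LAYERS, NUM_HEADS, HEAD_DIM)
mqa_bytes = kv_cache_bytes(NUM_TOKENS, NUM_LAYERS, NUM_KV_GROUPS, HEAD_DIM)

print(f"speedup: {speedup_percent:.1f}%")
print(f"multi-head KV cache:  {mha_bytes / 2**20:.0f} MiB")
print(f"multi-query KV cache: {mqa_bytes / 2**20:.0f} MiB")
print(f"reduction factor: {mha_bytes // mqa_bytes}x")
```

Under these assumptions the cache shrinks by the ratio of key/value heads (32 to 2). Real memory use also includes model weights and activations, so the estimate only explains the trend in the memory table, not its absolute values.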
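## Usage

### Environment Setup

First, download this repository:

```shell
git clone https://github.com/THUDM/ChatGLM2-6B
cd ChatGLM2-6B
```

Then install the dependencies with pip: `pip install -r requirements.txt`. For the best inference performance, `transformers` version `4.30.2` and `torch` 2.0 or later are recommended.

### Calling the Model from Code

The following code calls ChatGLM2-6B to generate a dialogue:

```python
>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
>>> model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True, device='cuda')
>>> model = model.eval()
>>> response, history = model.chat(tokenizer, "Hello", history=[])
>>> print(response)
Hello! I am the AI assistant ChatGLM2-6B. Nice to meet you, feel free to ask me any question.
>>> # Pass the returned history back in to continue the same conversation.
>>> response, history = model.chat(tokenizer, "What should I do if I can't sleep at night?", history=history)
>>> print(response)  # the reply is a numbered list of six suggestions
```

#### Loading the Model Locally

The code above lets `transformers` download the model implementation and parameters automatically. The complete model implementation is on the [Hugging Face Hub](https://huggingface.co/THUDM/chatglm2-6b). If your network connection is poor, the download may be slow or fail. In that case, download the model to a local directory first and load it from there.

To download the model from the Hugging Face Hub, first [install Git LFS](https://docs.github.com/zh/repositories/working-with-files/managing-large-files/installing-git-large-file-storage), then run:

```Shell
git clone https://huggingface.co/THUDM/chatglm2-6b
```

If downloading the checkpoint from the Hugging Face Hub is slow, download only the model implementation:

```Shell
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/THUDM/chatglm2-6b
```

Then download the model parameter files manually from [here](https://cloud.tsinghua.edu.cn/d/674208019e314311ab5c/) and put them in the local `chatglm2-6b` directory.

After the model is downloaded, replace `THUDM/chatglm2-6b` in the code above with the path of your local `chatglm2-6b` directory to load the model from disk.

The model implementation is still changing. To pin the implementation for compatibility, add `revision="v1.0"` to the `from_pretrained` call. `v1.0` is the current latest version; see the [Change Log](https://huggingface.co/THUDM/chatglm2-6b#change-log) for the full list of versions.

### Web Demo

![web-demo](resources/web-demo.gif)

First install Gradio with `pip install gradio`, then run [web_demo.py](web_demo.py) from the repository:

```shell
python web_demo.py
```

The program starts a Web Server and prints its address. Open the printed address in a browser to use the demo.

> By default the demo starts with `share=False` and does not create a public link. For public access, change this to `share=True`.

Thanks to [@AdamBear](https://github.com/AdamBear) for the Streamlit-based web demo `web_demo2.py`. Install the extra dependencies first:

```shell
pip install streamlit streamlit-chat
```

Then run:

```shell
streamlit run web_demo2.py
```

In our tests, the Streamlit demo responds more smoothly when the input prompt is long.

### Command-Line Demo

![cli-demo](resources/cli-demo.png)

Run [cli_demo.py](cli_demo.py) from the repository:

```shell
python cli_demo.py
```

The program runs an interactive dialogue in the terminal. Type an instruction and press Enter to generate a reply, type `clear` to clear the dialogue history, and type `stop` to exit.

### API Deployment

First install the extra dependencies with `pip install fastapi uvicorn`, then run [api.py](api.py) from the repository:

```shell
python api.py
```

By default the service listens on local port 8000 and is called with a POST request:

```shell
curl -X POST "http://127.0.0.1:8000" \
     -H 'Content-Type: application/json' \
     -d '{"prompt": "Hello", "history": []}'
```

The response has this form:

```shell
{
  "response":"Hello! I am the AI assistant ChatGLM2-6B. Nice to meet you, feel free to ask me any question.",
  "history":[["Hello","Hello! I am the AI assistant ChatGLM2-6B. Nice to meet you, feel free to ask me any question."]],
  "status":200,
  "time":"2023-03-23 21:38:40"
}
```

Thanks to [@hiyouga]() for implementing a streaming API in the OpenAI format, which can serve as the backend of any ChatGPT-based application, such as [ChatGPT-Next-Web](https://github.com/Yidadaa/ChatGPT-Next-Web). Deploy it by running [openai_api.py](openai_api.py) from the repository:

```shell
python openai_api.py
```

Example code for calling the API:

```python
import openai

if __name__ == "__main__":
    # Point the client at the local server instead of the OpenAI service.
    openai.api_base = "http://localhost:8000/v1"
    openai.api_key = "none"
    # Stream the reply and print each chunk as it arrives.
    for chunk in openai.ChatCompletion.create(
        model="chatglm2-6b",
        messages=[
            {"role": "user", "content": "Hello"}
        ],
        stream=True
    ):
        if hasattr(chunk.choices[0].delta, "content"):
            print(chunk.choices[0].delta.content, end="", flush=True)
```

## Low-Cost Deployment

### Model Quantization

By default the model is loaded in FP16 precision, which needs about 13GB of GPU memory. If your GPU memory is limited, try loading the model in quantized form:

```python
# Adjust as needed; only 4-bit and 8-bit quantization are currently supported
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).quantize(8).cuda()
```

Quantization costs some model quality. In our tests, ChatGLM2-6B still generates natural, fluent text under 4-bit quantization.

If you do not have enough memory, you can load the quantized model directly:

```python
model = AutoModel.from_pretrained("THUDM/chatglm2-6b-int4",trust_remote_code=True).cuda()
```

### CPU Deployment

If you have no GPU, you can run inference on the CPU, although it is slower. This needs about 32GB of memory:

```python
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).float()
```

If you do not have enough memory, you can load the quantized model directly:

```python
model = AutoModel.from_pretrained("THUDM/chatglm2-6b-int4",trust_remote_code=True).float()
```

Running the quantized model on the cpu requires `gcc` and `openmp`. Most Linux distributions install them by default. On Windows, select `openmp` when installing [TDM-GCC](https://jmeubank.github.io/tdm-gcc/). The `gcc` version tested on Windows is `TDM-GCC 10.3.0`, and on Linux it is `gcc 11.3.0`. For MacOS, see [Q1](FAQ.md#q1).

### Mac Deployment

On a Mac with Apple Silicon or an AMD GPU, the MPS backend can run ChatGLM2-6B on the GPU. Follow Apple's [official instructions](https://developer.apple.com/metal/pytorch) to install PyTorch-Nightly (the correct version number looks like 2.x.x.dev2023xxxx, not 2.x.x).

MacOS currently supports only [loading the model locally](README.md#). Change the model loading code to load from a local path and use the mps backend:

```python
model = AutoModel.from_pretrained("your local path", trust_remote_code=True).to('mps')
```

Loading the half-precision ChatGLM2-6B model needs about 13GB of memory. A machine with less memory (for example a MacBook Pro with 16GB) falls back to virtual memory on disk when memory runs short, which slows inference severely.

In that case you can use the quantized model chatglm2-6b-int4. Because the GPU quantization kernel is written in CUDA, it cannot be used on MacOS, so inference runs on the CPU only. To make full use of CPU parallelism, also [install OpenMP separately](FAQ.md#q1).

### Multi-GPU Deployment

If you have several GPUs but none of them has enough memory to hold the whole model, you can split the model across GPUs. First install accelerate with `pip install accelerate`, then load the model as follows:

```python
from utils import load_model_on_gpus
model = load_model_on_gpus("THUDM/chatglm2-6b", num_gpus=2)
```

This deploys the model on two GPUs for inference. Change `num_gpus` to the number of GPUs you want to use. The model is split evenly by default, and you can pass a `device_map` argument to set the placement yourself.

## License

The code in this repository is open source under the [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) license. Use of the ChatGLM2-6B model weights must follow the [Model License](MODEL_LICENSE). ChatGLM2-6B weights are **fully open** for academic research, and **commercial use is permitted** with official written permission. If you find our open-source model useful for your business, we welcome donations toward development of the next-generation model ChatGLM3.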
For commercial licensing and donations, contact [yiwen.xu@zhipuai.cn](mailto:yiwen.xu@zhipuai.cn).

## Citation

If you find our work helpful, please consider citing the following papers. A paper on ChatGLM2-6B is planned for release.

```
@article{zeng2022glm,
  title={Glm-130b: An open bilingual pre-trained model},
  author={Zeng, Aohan and Liu, Xiao and Du, Zhengxiao and Wang, Zihan and Lai, Hanyu and Ding, Ming and Yang, Zhuoyi and Xu, Yifan and Zheng, Wendi and Xia, Xiao and others},
  journal={arXiv preprint arXiv:2210.02414},
  year={2022}
}
```

```
@inproceedings{du2022glm,
  title={GLM: General Language Model Pretraining with Autoregressive Blank Infilling},
  author={Du, Zhengxiao and Qian, Yujie and Liu, Xiao and Ding, Ming and Qiu, Jiezhong and Yang, Zhilin and Tang, Jie},
  booktitle={Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  pages={320--335},
  year={2022}
}
```

Owner

  • Name: Dr. Artificial曾小健
  • Login: ArtificialZeng
  • Kind: user
  • Location: Beijing

LLM practitioner/engineer, AI/ML/DL Quant
