Recent Releases of chinese-mixtral
chinese-mixtral - 中文Mixtral大模型 v1.2
本次更新添加了仿OpenAI API Demo。教程:https://github.com/ymcui/Chinese-Mixtral/wiki/openaiapizh
This release adds OpenAI API Demo. Tutorial: https://github.com/ymcui/Chinese-Mixtral/wiki/openaiapien
What's Changed
- Add OpenAI API Demo by @ymcui in https://github.com/ymcui/Chinese-Mixtral/pull/25
Full Changelog: https://github.com/ymcui/Chinese-Mixtral/compare/v1.1...v1.2
- Python
Published by ymcui almost 2 years ago
chinese-mixtral - 中文Mixtral大模型 v1.1
本次更新主要有以下两点:
添加中文Mixtral技术报告,介绍了模型训练方法和相关实验分析
- 论文地址:https://arxiv.org/abs/2403.01851
添加了预训练和指令精调训练脚本
- 预训练:https://github.com/ymcui/Chinese-Mixtral/wiki/ptscriptszh
- 指令精调:https://github.com/ymcui/Chinese-Mixtral/wiki/sftscriptszh
What's Changed
- Add eval scripts by @iMountTai in https://github.com/ymcui/Chinese-Mixtral/pull/5
- Update readme and add requirements by @iMountTai in https://github.com/ymcui/Chinese-Mixtral/pull/7
- llama.cpp: add IQ3_XXS quantization models by @ymcui in https://github.com/ymcui/Chinese-Mixtral/pull/8
- Add training scripts by @iMountTai in https://github.com/ymcui/Chinese-Mixtral/pull/18
- Add Chinese Mixtral paper by @ymcui in https://github.com/ymcui/Chinese-Mixtral/pull/20
Full Changelog: https://github.com/ymcui/Chinese-Mixtral/compare/v1.0...v1.1
- Python
Published by ymcui almost 2 years ago
chinese-mixtral - 中文Mixtral大模型 v1.0
发布中文Mixtral, Mixtral-Instruct大模型已正式发布。 - Chinese-Mixtral:基座模型,使用20G语料增量训练 - Chinese-Mixtral-Instruct:指令/chat模型,在Chinese-Mixtral的基础上进一步通过指令精调(500万条指令)获得
模型特点
📖 稀疏混合专家模型
Mixtral是一个稀疏混合专家模型。该模型与以往的LLaMA等主流大模型结构具有显著差异,主要体现在以下几点:
- 每个FFN层包含8个不同的"专家"(全连接层),根据门控值选取最优的2个进行激活
- 输入序列中的每个token都会独立地选取专家,而不是整个序列对应一组专家
- 实际参数量约为46.7B,在推理时激活的参数量约为13B
🚄 原生支持32K上下文(实测支持128K)
Mixtral模型原生支持32K上下文(实测可达128K)。用户可使用单一模型来解决不同长度的各类任务。
模型效果
- 大模型竞技场:http://llm-arena.ymcui.com/
- 生成效果:https://github.com/ymcui/Chinese-Mixtral#生成效果评测
- 客观效果:https://github.com/ymcui/Chinese-Mixtral#客观效果评测
- Python
Published by ymcui about 2 years ago