Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.3%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: shanxiaojun
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 1.77 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
MINT
Quick Start
We fine-tune Qwen2-VL with LLaMA-Factory.
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train config/qwen_vl_Reduandancy.yaml
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train config/qwen_vl_Synergy.yaml
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train config/qwen_vl_Uniqueness.yaml
```
Configure your environment based on the instructions below.
Provided Datasets
Some datasets require confirmation before using them, so we recommend logging in with your Hugging Face account using these commands.
```bash
pip install --upgrade huggingface_hub
huggingface-cli login
```
Requirement
| Mandatory | Minimum | Recommend |
| ------------ | ------- | --------- |
| python | 3.8 | 3.11 |
| torch | 1.13.1 | 2.4.0 |
| transformers | 4.41.2 | 4.43.4 |
| datasets | 2.16.0 | 2.20.0 |
| accelerate | 0.30.1 | 0.32.0 |
| peft | 0.11.1 | 0.12.0 |
| trl | 0.8.6 | 0.9.6 |
| Optional | Minimum | Recommend |
| ------------ | ------- | --------- |
| CUDA | 11.6 | 12.2 |
| deepspeed | 0.10.0 | 0.14.0 |
| bitsandbytes | 0.39.0 | 0.43.1 |
| vllm | 0.4.3 | 0.5.0 |
| flash-attn | 2.3.0 | 2.6.3 |
Hardware Requirement
* estimated
| Method | Bits | 7B | 13B | 30B | 70B | 110B | 8x7B | 8x22B |
| ----------------- | ---- | ----- | ----- | ----- | ------ | ------ | ----- | ------ |
| Full | AMP | 120GB | 240GB | 600GB | 1200GB | 2000GB | 900GB | 2400GB |
| Full | 16 | 60GB | 120GB | 300GB | 600GB | 900GB | 400GB | 1200GB |
| Freeze | 16 | 20GB | 40GB | 80GB | 200GB | 360GB | 160GB | 400GB |
| LoRA/GaLore/BAdam | 16 | 16GB | 32GB | 64GB | 160GB | 240GB | 120GB | 320GB |
| QLoRA | 8 | 10GB | 20GB | 40GB | 80GB | 140GB | 60GB | 160GB |
| QLoRA | 4 | 6GB | 12GB | 24GB | 48GB | 72GB | 30GB | 96GB |
| QLoRA | 2 | 4GB | 8GB | 16GB | 24GB | 48GB | 18GB | 48GB |
Getting Started
Installation
> [!IMPORTANT]
> Installation is mandatory.
```bash
cd MINT
pip install -e ".[torch,metrics]"
```
Extra dependencies available: torch, torch-npu, metrics, deepspeed, liger-kernel, bitsandbytes, hqq, eetq, gptq, awq, aqlm, vllm, galore, badam, adam-mini, qwen, modelscope, openmind, quality
> [!TIP]
> Use `pip install --no-deps -e .` to resolve package conflicts.
For Windows users

If you want to enable quantized LoRA (QLoRA) on the Windows platform, you need to install a pre-built version of the `bitsandbytes` library, which supports CUDA 11.1 to 12.2. Please select the appropriate [release version](https://github.com/jllllll/bitsandbytes-windows-webui/releases/tag/wheels) based on your CUDA version.

```bash
pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.2.post2-py3-none-win_amd64.whl
```

To enable FlashAttention-2 on the Windows platform, you need to install the precompiled `flash-attn` library, which supports CUDA 12.1 to 12.2. Please download the corresponding version from [flash-attention](https://github.com/bdashore3/flash-attention/releases) based on your requirements.

For Ascend NPU users

To install LLaMA Factory on Ascend NPU devices, please specify extra dependencies: `pip install -e ".[torch-npu,metrics]"`. Additionally, you need to install the **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**. Please follow the [installation tutorial](https://www.hiascend.com/document/detail/en/CANNCommunityEdition/600alphaX/softwareinstall/instg/atlasdeploy_03_0031.html) or use the following commands:

```bash
# replace the url according to your CANN version and devices
# install CANN Toolkit
wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/Milan-ASL/Milan-ASL%20V100R001C17SPC701/Ascend-cann-toolkit_8.0.RC1.alpha001_linux-"$(uname -i)".run
bash Ascend-cann-toolkit_8.0.RC1.alpha001_linux-"$(uname -i)".run --install

# install CANN Kernels
wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/Milan-ASL/Milan-ASL%20V100R001C17SPC701/Ascend-cann-kernels-910b_8.0.RC1.alpha001_linux.run
bash Ascend-cann-kernels-910b_8.0.RC1.alpha001_linux.run --install

# set env variables
source /usr/local/Ascend/ascend-toolkit/set_env.sh
```

| Requirement | Minimum | Recommend |
| ----------- | ------- | ----------- |
| CANN | 8.0.RC1 | 8.0.RC1 |
| torch | 2.1.0 | 2.1.0 |
| torch-npu | 2.1.0 | 2.1.0.post3 |
| deepspeed | 0.13.2 | 0.13.2 |

Remember to use `ASCEND_RT_VISIBLE_DEVICES` instead of `CUDA_VISIBLE_DEVICES` to specify the device to use. If you cannot infer a model on NPU devices, try setting `do_sample: false` in the configurations.

Download the pre-built Docker images: [32GB](http://mirrors.cn-central-221.ovaijisuan.com/detail/130.html) | [64GB](http://mirrors.cn-central-221.ovaijisuan.com/detail/131.html)

Data Preparation
Please refer to data/README.md for details about the format of dataset files. You can either use datasets on the HuggingFace / ModelScope / Modelers hub or load datasets from local disk.
> [!NOTE]
> Please update `data/dataset_info.json` to use your custom dataset.

> [!TIP]
> Use `llamafactory-cli help` to show help information.
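As an illustration, a custom entry in `data/dataset_info.json` might look like the sketch below; the dataset name, file name, and column mapping are hypothetical placeholders, so consult data/README.md for the authoritative schema.

```json
{
  "my_custom_dataset": {
    "file_name": "my_custom_dataset.json",
    "formatting": "sharegpt",
    "columns": {
      "messages": "conversations",
      "images": "images"
    }
  }
}
```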
Fine-Tuning with LLaMA Board GUI (powered by Gradio)
```bash
llamafactory-cli webui
```
Build Docker
For CUDA users:
```bash
cd docker/docker-cuda/
docker compose up -d
docker compose exec llamafactory bash
```
For Ascend NPU users:
```bash
cd docker/docker-npu/
docker compose up -d
docker compose exec llamafactory bash
```
For AMD ROCm users:
```bash
cd docker/docker-rocm/
docker compose up -d
docker compose exec llamafactory bash
```
Build without Docker Compose
For CUDA users:

```bash
docker build -f ./docker/docker-cuda/Dockerfile \
    --build-arg INSTALL_BNB=false \
    --build-arg INSTALL_VLLM=false \
    --build-arg INSTALL_DEEPSPEED=false \
    --build-arg INSTALL_FLASHATTN=false \
    --build-arg PIP_INDEX=https://pypi.org/simple \
    -t llamafactory:latest .

docker run -dit --gpus=all \
    -v ./hf_cache:/root/.cache/huggingface \
    -v ./ms_cache:/root/.cache/modelscope \
    -v ./om_cache:/root/.cache/openmind \
    -v ./data:/app/data \
    -v ./output:/app/output \
    -p 7860:7860 \
    -p 8000:8000 \
    --shm-size 16G \
    --name llamafactory \
    llamafactory:latest

docker exec -it llamafactory bash
```

For Ascend NPU users:

```bash
# Choose docker image upon your environment
docker build -f ./docker/docker-npu/Dockerfile \
    --build-arg INSTALL_DEEPSPEED=false \
    --build-arg PIP_INDEX=https://pypi.org/simple \
    -t llamafactory:latest .

# Change `device` upon your resources
docker run -dit \
    -v ./hf_cache:/root/.cache/huggingface \
    -v ./ms_cache:/root/.cache/modelscope \
    -v ./om_cache:/root/.cache/openmind \
    -v ./data:/app/data \
    -v ./output:/app/output \
    -v /usr/local/dcmi:/usr/local/dcmi \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
    -v /etc/ascend_install.info:/etc/ascend_install.info \
    -p 7860:7860 \
    -p 8000:8000 \
    --device /dev/davinci0 \
    --device /dev/davinci_manager \
    --device /dev/devmm_svm \
    --device /dev/hisi_hdc \
    --shm-size 16G \
    --name llamafactory \
    llamafactory:latest

docker exec -it llamafactory bash
```

For AMD ROCm users:

```bash
docker build -f ./docker/docker-rocm/Dockerfile \
    --build-arg INSTALL_BNB=false \
    --build-arg INSTALL_VLLM=false \
    --build-arg INSTALL_DEEPSPEED=false \
    --build-arg INSTALL_FLASHATTN=false \
    --build-arg PIP_INDEX=https://pypi.org/simple \
    -t llamafactory:latest .

docker run -dit \
    -v ./hf_cache:/root/.cache/huggingface \
    -v ./ms_cache:/root/.cache/modelscope \
    -v ./om_cache:/root/.cache/openmind \
    -v ./data:/app/data \
    -v ./output:/app/output \
    -v ./saves:/app/saves \
    -p 7860:7860 \
    -p 8000:8000 \
    --device /dev/kfd \
    --device /dev/dri \
    --shm-size 16G \
    --name llamafactory \
    llamafactory:latest

docker exec -it llamafactory bash
```

Details about volume

- `hf_cache`: Utilize Hugging Face cache on the host machine. Reassignable if a cache already exists in a different directory.
- `ms_cache`: Similar to Hugging Face cache but for ModelScope users.
- `om_cache`: Similar to Hugging Face cache but for Modelers users.
- `data`: Place datasets on this dir of the host machine so that they can be selected on LLaMA Board GUI.
- `output`: Set export dir to this location so that the merged result can be accessed directly on the host machine.

Deploy with OpenAI-style API and vLLM
```bash
API_PORT=8000 llamafactory-cli api examples/inference/llama3_vllm.yaml
```
> [!TIP]
> Visit this page for the API documentation.
Examples: Image understanding | Function calling
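The deployed endpoint follows the OpenAI chat-completions request format. As a minimal sketch, a request body can be built and sent as below; the server address and the model name `qwen2-vl` are assumptions for illustration, not values taken from this repository.

```shell
# Build an OpenAI-style chat request body; "qwen2-vl" is a placeholder model name
BODY='{"model": "qwen2-vl", "messages": [{"role": "user", "content": "Hello!"}]}'
# With the API server from the previous step running, the request could be sent with:
#   curl -s http://localhost:8000/v1/chat/completions \
#        -H "Content-Type: application/json" -d "$BODY"
echo "$BODY"
```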
Download from ModelScope Hub
If you have trouble with downloading models and datasets from Hugging Face, you can use ModelScope.
bash
export USE_MODELSCOPE_HUB=1 # `set USE_MODELSCOPE_HUB=1` for Windows
Train the model by specifying a model ID of the ModelScope Hub as the `model_name_or_path`. You can find a full list of model IDs at ModelScope Hub, e.g., `LLM-Research/Meta-Llama-3-8B-Instruct`.
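For instance, a training yaml could point `model_name_or_path` at the ModelScope ID mentioned above; the remaining keys of the config are unchanged, and this fragment is only a sketch:

```yaml
model_name_or_path: LLM-Research/Meta-Llama-3-8B-Instruct  # ModelScope model ID
# ...rest of your existing training configuration...
```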
Download from Modelers Hub
You can also use Modelers Hub to download models and datasets.
```bash
export USE_OPENMIND_HUB=1 # `set USE_OPENMIND_HUB=1` for Windows
```
Train the model by specifying a model ID of the Modelers Hub as the `model_name_or_path`. You can find a full list of model IDs at Modelers Hub, e.g., `TeleAI/TeleChat-7B-pt`.
Use W&B Logger
To use Weights & Biases for logging experimental results, you need to add the following arguments to yaml files.
```yaml
report_to: wandb
run_name: test_run # optional
```
Set `WANDB_API_KEY` to your key when launching training tasks to log in with your W&B account.
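As a minimal sketch (the key value is a placeholder), the variable can be exported before launching training in the same shell:

```shell
# Placeholder key; substitute your own W&B API key
export WANDB_API_KEY="your-api-key-here"
# Confirm the variable is visible to child processes such as the training CLI
echo "WANDB_API_KEY is ${WANDB_API_KEY:+set}"
```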
License
This repository is licensed under the Apache-2.0 License.
Please follow the model licenses to use the corresponding model weights: Qwen
Acknowledgement
This repo benefits from LLaMA-Factory and Qwen2-VL. Thanks for their wonderful work.
Owner
- Name: ideashan
- Login: shanxiaojun
- Kind: user
- Company: University of Electronic Science and Technology of China
- Repositories: 1
- Profile: https://github.com/shanxiaojun
An undergraduate student at the University of Electronic Science and Technology of China
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
date-released: 2024-03
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Zheng"
    given-names: "Yaowei"
  - family-names: "Zhang"
    given-names: "Richong"
  - family-names: "Zhang"
    given-names: "Junhao"
  - family-names: "Ye"
    given-names: "Yanhan"
  - family-names: "Luo"
    given-names: "Zheyan"
  - family-names: "Feng"
    given-names: "Zhangchi"
  - family-names: "Ma"
    given-names: "Yongqiang"
title: "LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models"
url: "https://arxiv.org/abs/2403.13372"
preferred-citation:
  type: conference-paper
  conference:
    name: "Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)"
  authors:
    - family-names: "Zheng"
      given-names: "Yaowei"
    - family-names: "Zhang"
      given-names: "Richong"
    - family-names: "Zhang"
      given-names: "Junhao"
    - family-names: "Ye"
      given-names: "Yanhan"
    - family-names: "Luo"
      given-names: "Zheyan"
    - family-names: "Feng"
      given-names: "Zhangchi"
    - family-names: "Ma"
      given-names: "Yongqiang"
  title: "LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models"
  url: "https://arxiv.org/abs/2403.13372"
  year: 2024
  publisher: "Association for Computational Linguistics"
  address: "Bangkok, Thailand"
```
GitHub Events
Total
- Push event: 2
- Create event: 1
Last Year
- Push event: 2
- Create event: 1
Dependencies
- ${BASE_IMAGE} latest build
- ascendai/cann 8.0.rc1-910b-ubuntu22.04-py3.8 build
- hardandheavy/transformers-rocm 2.2.0 build
- accelerate >=0.34.0,<=1.0.1
- av *
- datasets >=2.16.0,<=3.1.0
- einops *
- fastapi *
- fire *
- gradio >=4.0.0,<5.0.0
- matplotlib >=3.7.0
- numpy <2.0.0
- packaging *
- pandas >=2.0.0
- peft >=0.11.1,<=0.12.0
- protobuf *
- pydantic *
- pyyaml *
- scipy *
- sentencepiece *
- sse-starlette *
- tiktoken *
- transformers >=4.41.2,<=4.46.1
- trl >=0.8.6,<=0.9.6
- uvicorn *
- accelerate <=1.0.1,>=0.34.0
- adam-mini *
- aqlm >=1.1.0
- auto-gptq >=0.5.0
- autoawq *
- av *
- badam >=1.2.1
- bitsandbytes >=0.39.0
- datasets <=3.1.0,>=2.16.0
- decorator *
- deepspeed <=0.14.4,>=0.10.0
- eetq *
- einops *
- fastapi *
- fire *
- galore-torch *
- gradio <5.0.0,>=4.0.0
- hqq *
- jieba *
- liger-kernel *
- matplotlib >=3.7.0
- modelscope *
- nltk *
- numpy <2.0.0
- openmind *
- optimum >=1.17.0
- packaging *
- pandas >=2.0.0
- peft <=0.12.0,>=0.11.1
- pre-commit *
- protobuf *
- pydantic *
- pytest *
- pyyaml *
- rouge-chinese *
- ruff *
- scipy *
- sentencepiece *
- sse-starlette *
- tiktoken *
- torch ==2.1.0
- torch >=1.13.1
- torch-npu ==2.1.0.post3
- transformers <=4.46.1,>=4.41.2
- transformers_stream_generator *
- trl <=0.9.6,>=0.8.6
- uvicorn *
- vllm <0.6.4,>=0.4.3