Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.3%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: shanxiaojun
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 1.77 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 10 months ago · Last pushed 9 months ago
Metadata Files
Readme License Citation

README.md

MINT

Quick Start

We fine-tune Qwen2-VL with LLaMA-Factory:

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train config/qwen_vl_Reduandancy.yaml
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train config/qwen_vl_Synergy.yaml
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train config/qwen_vl_Uniqueness.yaml
```

Configure your environment based on the instructions below.

Provided Datasets

Some datasets require confirmation before use, so we recommend logging in to your Hugging Face account with the following commands.

```bash
pip install --upgrade huggingface_hub
huggingface-cli login
```

Requirement

| Mandatory    | Minimum | Recommend |
| ------------ | ------- | --------- |
| python       | 3.8     | 3.11      |
| torch        | 1.13.1  | 2.4.0     |
| transformers | 4.41.2  | 4.43.4    |
| datasets     | 2.16.0  | 2.20.0    |
| accelerate   | 0.30.1  | 0.32.0    |
| peft         | 0.11.1  | 0.12.0    |
| trl          | 0.8.6   | 0.9.6     |

| Optional     | Minimum | Recommend |
| ------------ | ------- | --------- |
| CUDA         | 11.6    | 12.2      |
| deepspeed    | 0.10.0  | 0.14.0    |
| bitsandbytes | 0.39.0  | 0.43.1    |
| vllm         | 0.4.3   | 0.5.0     |
| flash-attn   | 2.3.0   | 2.6.3     |
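The minimum versions above can be verified programmatically. A minimal sketch (the helper names are our own; the package names and version floors come from the mandatory table):

```python
# Check installed packages against the minimum versions in the table above.
# Illustrative only -- not part of the repository.
from importlib import metadata

MINIMUMS = {
    "torch": "1.13.1",
    "transformers": "4.41.2",
    "datasets": "2.16.0",
    "accelerate": "0.30.1",
    "peft": "0.11.1",
    "trl": "0.8.6",
}

def version_tuple(v: str) -> tuple:
    """Parse a dotted version string into a comparable tuple of ints."""
    return tuple(int(p) for p in v.split(".") if p.isdigit())

def check_requirements(minimums=MINIMUMS) -> dict:
    """Return {package: (installed_version, meets_minimum)} per requirement."""
    report = {}
    for pkg, minimum in minimums.items():
        try:
            installed = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            report[pkg] = (None, False)  # package missing entirely
            continue
        report[pkg] = (installed, version_tuple(installed) >= version_tuple(minimum))
    return report
```

Non-numeric suffixes such as local CUDA build tags (`+cu121`) are simply skipped by `version_tuple`, which is good enough for a coarse sanity check.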

Hardware Requirement

* estimated

| Method            | Bits | 7B    | 13B   | 30B   | 70B    | 110B   | 8x7B  | 8x22B  |
| ----------------- | ---- | ----- | ----- | ----- | ------ | ------ | ----- | ------ |
| Full              | AMP  | 120GB | 240GB | 600GB | 1200GB | 2000GB | 900GB | 2400GB |
| Full              | 16   | 60GB  | 120GB | 300GB | 600GB  | 900GB  | 400GB | 1200GB |
| Freeze            | 16   | 20GB  | 40GB  | 80GB  | 200GB  | 360GB  | 160GB | 400GB  |
| LoRA/GaLore/BAdam | 16   | 16GB  | 32GB  | 64GB  | 160GB  | 240GB  | 120GB | 320GB  |
| QLoRA             | 8    | 10GB  | 20GB  | 40GB  | 80GB   | 140GB  | 60GB  | 160GB  |
| QLoRA             | 4    | 6GB   | 12GB  | 24GB  | 48GB   | 72GB   | 30GB  | 96GB   |
| QLoRA             | 2    | 4GB   | 8GB   | 16GB  | 24GB   | 48GB   | 18GB  | 48GB   |
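For scripting around these estimates, the table can be encoded as a small lookup helper. The values below are copied verbatim from the table; the function itself is illustrative and not part of the repository:

```python
# The estimated-VRAM table above, as a (method, bits) -> {size: GB} lookup.
VRAM_GB = {
    ("Full", "AMP"): {"7B": 120, "13B": 240, "30B": 600, "70B": 1200, "110B": 2000, "8x7B": 900, "8x22B": 2400},
    ("Full", "16"): {"7B": 60, "13B": 120, "30B": 300, "70B": 600, "110B": 900, "8x7B": 400, "8x22B": 1200},
    ("Freeze", "16"): {"7B": 20, "13B": 40, "30B": 80, "70B": 200, "110B": 360, "8x7B": 160, "8x22B": 400},
    ("LoRA/GaLore/BAdam", "16"): {"7B": 16, "13B": 32, "30B": 64, "70B": 160, "110B": 240, "8x7B": 120, "8x22B": 320},
    ("QLoRA", "8"): {"7B": 10, "13B": 20, "30B": 40, "70B": 80, "110B": 140, "8x7B": 60, "8x22B": 160},
    ("QLoRA", "4"): {"7B": 6, "13B": 12, "30B": 24, "70B": 48, "110B": 72, "8x7B": 30, "8x22B": 96},
    ("QLoRA", "2"): {"7B": 4, "13B": 8, "30B": 16, "70B": 24, "110B": 48, "8x7B": 18, "8x22B": 48},
}

def estimated_vram_gb(method: str, bits: str, model_size: str) -> int:
    """Look up the estimated GPU memory (GB) for a method/bits/model combo."""
    return VRAM_GB[(method, bits)][model_size]
```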

Getting Started

Installation

> [!IMPORTANT]
> Installation is mandatory.

```bash
cd MINT
pip install -e ".[torch,metrics]"
```

Extra dependencies available: torch, torch-npu, metrics, deepspeed, liger-kernel, bitsandbytes, hqq, eetq, gptq, awq, aqlm, vllm, galore, badam, adam-mini, qwen, modelscope, openmind, quality

> [!TIP]
> Use `pip install --no-deps -e .` to resolve package conflicts.

For Windows users

If you want to enable quantized LoRA (QLoRA) on the Windows platform, you need to install a pre-built version of the `bitsandbytes` library, which supports CUDA 11.1 to 12.2. Please select the appropriate [release version](https://github.com/jllllll/bitsandbytes-windows-webui/releases/tag/wheels) based on your CUDA version.

```bash
pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.2.post2-py3-none-win_amd64.whl
```

To enable FlashAttention-2 on the Windows platform, you need to install the precompiled `flash-attn` library, which supports CUDA 12.1 to 12.2. Please download the corresponding version from [flash-attention](https://github.com/bdashore3/flash-attention/releases) based on your requirements.
For Ascend NPU users

To install LLaMA Factory on Ascend NPU devices, please specify extra dependencies: `pip install -e ".[torch-npu,metrics]"`. Additionally, you need to install the **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**. Please follow the [installation tutorial](https://www.hiascend.com/document/detail/en/CANNCommunityEdition/600alphaX/softwareinstall/instg/atlasdeploy_03_0031.html) or use the following commands:

```bash
# replace the url according to your CANN version and devices
# install CANN Toolkit
wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/Milan-ASL/Milan-ASL%20V100R001C17SPC701/Ascend-cann-toolkit_8.0.RC1.alpha001_linux-"$(uname -i)".run
bash Ascend-cann-toolkit_8.0.RC1.alpha001_linux-"$(uname -i)".run --install

# install CANN Kernels
wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/Milan-ASL/Milan-ASL%20V100R001C17SPC701/Ascend-cann-kernels-910b_8.0.RC1.alpha001_linux.run
bash Ascend-cann-kernels-910b_8.0.RC1.alpha001_linux.run --install

# set env variables
source /usr/local/Ascend/ascend-toolkit/set_env.sh
```

| Requirement | Minimum | Recommend   |
| ----------- | ------- | ----------- |
| CANN        | 8.0.RC1 | 8.0.RC1     |
| torch       | 2.1.0   | 2.1.0       |
| torch-npu   | 2.1.0   | 2.1.0.post3 |
| deepspeed   | 0.13.2  | 0.13.2      |

Remember to use `ASCEND_RT_VISIBLE_DEVICES` instead of `CUDA_VISIBLE_DEVICES` to specify the device to use.

If you cannot run inference on NPU devices, try setting `do_sample: false` in the configurations.

Download the pre-built Docker images: [32GB](http://mirrors.cn-central-221.ovaijisuan.com/detail/130.html) | [64GB](http://mirrors.cn-central-221.ovaijisuan.com/detail/131.html)

Data Preparation

Please refer to data/README.md for details on the dataset file format. You can either use datasets from the HuggingFace / ModelScope / Modelers hub or load a dataset from local disk.

> [!NOTE]
> Please update `data/dataset_info.json` to use your custom dataset.
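As an illustration only (the dataset name, file name, and column mapping below are hypothetical; see data/README.md for the authoritative schema), a custom entry in `data/dataset_info.json` typically maps a dataset name to its file and columns:

```json
{
  "my_custom_dataset": {
    "file_name": "my_custom_dataset.json",
    "columns": {
      "prompt": "instruction",
      "query": "input",
      "response": "output"
    }
  }
}
```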

> [!TIP]
> Use `llamafactory-cli help` to show help information.

Fine-Tuning with LLaMA Board GUI (powered by Gradio)

```bash
llamafactory-cli webui
```

Build Docker

For CUDA users:

```bash
cd docker/docker-cuda/
docker compose up -d
docker compose exec llamafactory bash
```

For Ascend NPU users:

```bash
cd docker/docker-npu/
docker compose up -d
docker compose exec llamafactory bash
```

For AMD ROCm users:

```bash
cd docker/docker-rocm/
docker compose up -d
docker compose exec llamafactory bash
```

Build without Docker Compose

For CUDA users:

```bash
docker build -f ./docker/docker-cuda/Dockerfile \
    --build-arg INSTALL_BNB=false \
    --build-arg INSTALL_VLLM=false \
    --build-arg INSTALL_DEEPSPEED=false \
    --build-arg INSTALL_FLASHATTN=false \
    --build-arg PIP_INDEX=https://pypi.org/simple \
    -t llamafactory:latest .

docker run -dit --gpus=all \
    -v ./hf_cache:/root/.cache/huggingface \
    -v ./ms_cache:/root/.cache/modelscope \
    -v ./om_cache:/root/.cache/openmind \
    -v ./data:/app/data \
    -v ./output:/app/output \
    -p 7860:7860 \
    -p 8000:8000 \
    --shm-size 16G \
    --name llamafactory \
    llamafactory:latest

docker exec -it llamafactory bash
```

For Ascend NPU users:

```bash
# Choose docker image upon your environment
docker build -f ./docker/docker-npu/Dockerfile \
    --build-arg INSTALL_DEEPSPEED=false \
    --build-arg PIP_INDEX=https://pypi.org/simple \
    -t llamafactory:latest .

# Change `device` upon your resources
docker run -dit \
    -v ./hf_cache:/root/.cache/huggingface \
    -v ./ms_cache:/root/.cache/modelscope \
    -v ./om_cache:/root/.cache/openmind \
    -v ./data:/app/data \
    -v ./output:/app/output \
    -v /usr/local/dcmi:/usr/local/dcmi \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
    -v /etc/ascend_install.info:/etc/ascend_install.info \
    -p 7860:7860 \
    -p 8000:8000 \
    --device /dev/davinci0 \
    --device /dev/davinci_manager \
    --device /dev/devmm_svm \
    --device /dev/hisi_hdc \
    --shm-size 16G \
    --name llamafactory \
    llamafactory:latest

docker exec -it llamafactory bash
```

For AMD ROCm users:

```bash
docker build -f ./docker/docker-rocm/Dockerfile \
    --build-arg INSTALL_BNB=false \
    --build-arg INSTALL_VLLM=false \
    --build-arg INSTALL_DEEPSPEED=false \
    --build-arg INSTALL_FLASHATTN=false \
    --build-arg PIP_INDEX=https://pypi.org/simple \
    -t llamafactory:latest .

docker run -dit \
    -v ./hf_cache:/root/.cache/huggingface \
    -v ./ms_cache:/root/.cache/modelscope \
    -v ./om_cache:/root/.cache/openmind \
    -v ./data:/app/data \
    -v ./output:/app/output \
    -v ./saves:/app/saves \
    -p 7860:7860 \
    -p 8000:8000 \
    --device /dev/kfd \
    --device /dev/dri \
    --shm-size 16G \
    --name llamafactory \
    llamafactory:latest

docker exec -it llamafactory bash
```
Details about volume

- `hf_cache`: Utilize Hugging Face cache on the host machine. Reassignable if a cache already exists in a different directory.
- `ms_cache`: Similar to the Hugging Face cache but for ModelScope users.
- `om_cache`: Similar to the Hugging Face cache but for Modelers users.
- `data`: Place datasets in this directory on the host machine so that they can be selected in the LLaMA Board GUI.
- `output`: Set the export dir to this location so that the merged result can be accessed directly on the host machine.

Deploy with OpenAI-style API and vLLM

```bash
API_PORT=8000 llamafactory-cli api examples/inference/llama3_vllm.yaml
```

> [!TIP]
> Visit this page for the API documentation.

Examples: Image understanding | Function calling
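Because the server speaks an OpenAI-style protocol, any OpenAI-compatible client can talk to it. A minimal stdlib sketch, assuming the server runs on `localhost:8000` and exposes the conventional `/v1/chat/completions` route (check the API documentation for the exact paths and model name):

```python
# Minimal sketch of calling the OpenAI-style API with the standard library.
# Host, port, route, and model name are assumptions based on the OpenAI
# convention, not confirmed project specifics.
import json
import urllib.request

def build_chat_payload(model: str, user_message: str) -> dict:
    """Build an OpenAI-style chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

def chat(base_url: str, model: str, user_message: str) -> str:
    """POST a chat request and return the assistant's reply text."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(model, user_message)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example (requires a running server):
# print(chat("http://localhost:8000", "qwen2-vl", "Describe this image."))
```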

Download from ModelScope Hub

If you have trouble with downloading models and datasets from Hugging Face, you can use ModelScope.

```bash
export USE_MODELSCOPE_HUB=1 # `set USE_MODELSCOPE_HUB=1` for Windows
```

Train the model by specifying a model ID of the ModelScope Hub as the model_name_or_path. You can find a full list of model IDs at ModelScope Hub, e.g., LLM-Research/Meta-Llama-3-8B-Instruct.

Download from Modelers Hub

You can also use Modelers Hub to download models and datasets.

```bash
export USE_OPENMIND_HUB=1 # `set USE_OPENMIND_HUB=1` for Windows
```

Train the model by specifying a model ID of the Modelers Hub as the model_name_or_path. You can find a full list of model IDs at Modelers Hub, e.g., TeleAI/TeleChat-7B-pt.

Use W&B Logger

To use Weights & Biases for logging experimental results, add the following arguments to your yaml files.

```yaml
report_to: wandb
run_name: test_run # optional
```

Set WANDB_API_KEY to your key when launching training tasks to log in with your W&B account.

License

This repository is licensed under the Apache-2.0 License.

Please follow the model licenses to use the corresponding model weights: Qwen

Acknowledgement

This repo benefits from LLaMA-Factory and Qwen2-VL. Thanks for their wonderful work.

Owner

  • Name: ideashan
  • Login: shanxiaojun
  • Kind: user
  • Company: University of Electronic Science and Technology of China

An undergraduate student at the University of Electronic Science and Technology of China

Citation (CITATION.cff)

cff-version: 1.2.0
date-released: 2024-03
message: "If you use this software, please cite it as below."
authors:
- family-names: "Zheng"
  given-names: "Yaowei"
- family-names: "Zhang"
  given-names: "Richong"
- family-names: "Zhang"
  given-names: "Junhao"
- family-names: "Ye"
  given-names: "Yanhan"
- family-names: "Luo"
  given-names: "Zheyan"
- family-names: "Feng"
  given-names: "Zhangchi"
- family-names: "Ma"
  given-names: "Yongqiang"
title: "LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models"
url: "https://arxiv.org/abs/2403.13372"
preferred-citation:
  type: conference-paper
  conference:
    name: "Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)"
  authors:
    - family-names: "Zheng"
      given-names: "Yaowei"
    - family-names: "Zhang"
      given-names: "Richong"
    - family-names: "Zhang"
      given-names: "Junhao"
    - family-names: "Ye"
      given-names: "Yanhan"
    - family-names: "Luo"
      given-names: "Zheyan"
    - family-names: "Feng"
      given-names: "Zhangchi"
    - family-names: "Ma"
      given-names: "Yongqiang"
  title: "LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models"
  url: "https://arxiv.org/abs/2403.13372"
  year: 2024
  publisher: "Association for Computational Linguistics"
  address: "Bangkok, Thailand"

GitHub Events

Total
  • Push event: 2
  • Create event: 1
Last Year
  • Push event: 2
  • Create event: 1

Dependencies

docker/docker-cuda/Dockerfile docker
  • ${BASE_IMAGE} latest build
docker/docker-cuda/docker-compose.yml docker
docker/docker-npu/Dockerfile docker
  • ascendai/cann 8.0.rc1-910b-ubuntu22.04-py3.8 build
docker/docker-npu/docker-compose.yml docker
docker/docker-rocm/Dockerfile docker
  • hardandheavy/transformers-rocm 2.2.0 build
docker/docker-rocm/docker-compose.yml docker
pyproject.toml pypi
requirements.txt pypi
  • accelerate >=0.34.0,<=1.0.1
  • av *
  • datasets >=2.16.0,<=3.1.0
  • einops *
  • fastapi *
  • fire *
  • gradio >=4.0.0,<5.0.0
  • matplotlib >=3.7.0
  • numpy <2.0.0
  • packaging *
  • pandas >=2.0.0
  • peft >=0.11.1,<=0.12.0
  • protobuf *
  • pydantic *
  • pyyaml *
  • scipy *
  • sentencepiece *
  • sse-starlette *
  • tiktoken *
  • transformers >=4.41.2,<=4.46.1
  • trl >=0.8.6,<=0.9.6
  • uvicorn *
setup.py pypi
src/llamafactory.egg-info/requires.txt pypi
  • accelerate <=1.0.1,>=0.34.0
  • adam-mini *
  • aqlm >=1.1.0
  • auto-gptq >=0.5.0
  • autoawq *
  • av *
  • badam >=1.2.1
  • bitsandbytes >=0.39.0
  • datasets <=3.1.0,>=2.16.0
  • decorator *
  • deepspeed <=0.14.4,>=0.10.0
  • eetq *
  • einops *
  • fastapi *
  • fire *
  • galore-torch *
  • gradio <5.0.0,>=4.0.0
  • hqq *
  • jieba *
  • liger-kernel *
  • matplotlib >=3.7.0
  • modelscope *
  • nltk *
  • numpy <2.0.0
  • openmind *
  • optimum >=1.17.0
  • packaging *
  • pandas >=2.0.0
  • peft <=0.12.0,>=0.11.1
  • pre-commit *
  • protobuf *
  • pydantic *
  • pytest *
  • pyyaml *
  • rouge-chinese *
  • ruff *
  • scipy *
  • sentencepiece *
  • sse-starlette *
  • tiktoken *
  • torch ==2.1.0
  • torch >=1.13.1
  • torch-npu ==2.1.0.post3
  • transformers <=4.46.1,>=4.41.2
  • transformers_stream_generator *
  • trl <=0.9.6,>=0.8.6
  • uvicorn *
  • vllm <0.6.4,>=0.4.3