mlora-cli

An Efficient "Factory" to Build Multiple LoRA Adapters

https://github.com/tudb-labs/mlora

Science Score: 64.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    1 of 18 committers (5.6%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.0%) to scientific vocabulary

Keywords

baichuan chatglm dpo finetune gpu llama llama2 llm lora mlora peft rlhf
Last synced: 6 months ago

Repository

An Efficient "Factory" to Build Multiple LoRA Adapters

Basic Info
  • Host: GitHub
  • Owner: TUDB-Labs
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 11 MB
Statistics
  • Stars: 337
  • Watchers: 3
  • Forks: 61
  • Open Issues: 13
  • Releases: 1
Topics
baichuan chatglm dpo finetune gpu llama llama2 llm lora mlora peft rlhf
Created over 2 years ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation Security

README.md

mLoRA

An Efficient "Factory" to Build Multiple LoRA Adapters

mLoRA (a.k.a Multi-LoRA Fine-Tune) is an open-source framework designed for efficient fine-tuning of multiple Large Language Models (LLMs) using LoRA and its variants. Key features of mLoRA include:

  • Concurrent fine-tuning of multiple LoRA adapters.

  • Shared base model among multiple LoRA adapters (see the sketch after this list).

  • Efficient pipeline parallelism algorithm.

  • Support for multiple LoRA variant algorithms and various base models.

  • Support for multiple reinforcement learning preference alignment algorithms.
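
As a rough, hedged illustration of the shared-base-model idea (this is not mLoRA's actual implementation; all names and sizes below are made up for the example), the following sketch applies two independent LoRA adapters on top of a single frozen weight matrix:

```python
# Illustrative sketch only: two LoRA adapters sharing one frozen base weight.
# Not taken from the mLoRA source; dimensions and names are arbitrary.
import torch

hidden, rank = 64, 8
base_weight = torch.randn(hidden, hidden, requires_grad=False)  # shared, frozen

# Each adapter owns only its small A/B matrices (the trainable parameters).
adapters = {
    "adapter_0": (torch.zeros(hidden, rank, requires_grad=True),   # B, init to zero
                  torch.randn(rank, hidden, requires_grad=True)),  # A
    "adapter_1": (torch.zeros(hidden, rank, requires_grad=True),
                  torch.randn(rank, hidden, requires_grad=True)),
}

x = torch.randn(4, hidden)  # a toy input batch
for name, (lora_b, lora_a) in adapters.items():
    # y = x W^T + (x A^T) B^T : base output plus the adapter's low-rank update
    y = x @ base_weight.T + (x @ lora_a.T) @ lora_b.T
    print(name, tuple(y.shape))
```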

The end-to-end architecture of mLoRA is shown in the figure:

Latest News

  • [2025/01] mLoRA has been accepted by VLDB'25

Quickstart

First, clone this repository and install the dependencies (or use our Docker image):

```bash
# Clone the repository
git clone https://github.com/TUDB-Labs/mLoRA
cd mLoRA

# Install the requirements (Python >= 3.12 is required)
pip install .
```

The mlora_train.py script is a starting point for batch fine-tuning LoRA adapters:

```bash
python mlora_train.py \
    --base_model TinyLlama/TinyLlama-1.1B-Chat-v0.4 \
    --config demo/lora/lora_case_1.yaml
```

You can find the adapters' configuration files in the demo folder; they include configurations for the different LoRA variants and the reinforcement learning preference alignment algorithms.
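
As a hedged sketch only (the key names below are assumptions for illustration, not the actual schema of demo/lora/lora_case_1.yaml), an adapter configuration of this kind could be loaded and inspected like so:

```python
# Illustrative only: load an adapter-style YAML config and list its adapters.
# The keys (adapters, name, rank, alpha, dropout) are assumed, not mLoRA's schema.
import yaml

example = """
adapters:
  - name: lora_demo_0
    type: lora
    rank: 8
    alpha: 16
    dropout: 0.05
  - name: lora_demo_1
    type: lora
    rank: 16
    alpha: 32
    dropout: 0.05
"""

config = yaml.safe_load(example)
for adapter in config["adapters"]:
    print(adapter["name"], "rank =", adapter["rank"])
```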

For further detailed usage information, use the --help option:

```bash
python mlora_train.py --help
```

Deployment using pipeline parallelism

Similar to the Quickstart, the commands to start mLoRA in a two-node environment are as follows:

NOTE1: Use environment variables MASTER_ADDR/MASTER_PORT to set the master node.

NOTE2: Set `balance`, the number of decoder layers allocated to each rank (a short sketch of this split follows the example below).

```bash
# In the first node
export MASTER_ADDR=master.svc.cluster.local
export MASTER_PORT=12355
python mlora_pp_train.py \
    --base_model TinyLlama/TinyLlama-1.1B-Chat-v0.4 \
    --config demo/lora/lora_case_1.yaml \
    --pipeline \
    --device "cuda:0" \
    --rank 0 \
    --balance 12 13 \
    --no-recompute \
    --precision fp32

# In the second node
export MASTER_ADDR=master.svc.cluster.local
export MASTER_PORT=12355
python mlora_pp_train.py \
    --base_model TinyLlama/TinyLlama-1.1B-Chat-v0.4 \
    --config demo/lora/lora_case_1.yaml \
    --pipeline \
    --device "cuda:1" \
    --rank 1 \
    --balance 12 13 \
    --no-recompute \
    --precision fp32
```
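
As an illustration of NOTE2 above, here is a minimal sketch (not part of mLoRA's code; the function name and the 25-layer total are assumptions) of how a `--balance 12 13` argument maps to contiguous blocks of decoder layers per rank:

```python
# Minimal sketch: turn a "balance" list into contiguous per-rank layer ranges.
# Mirrors the idea behind `--balance 12 13`; illustrative only, not mLoRA code.
from typing import List, Tuple


def split_layers(balance: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) decoder-layer indices (end exclusive) for each rank."""
    ranges = []
    start = 0
    for count in balance:
        ranges.append((start, start + count))
        start += count
    return ranges


if __name__ == "__main__":
    # e.g. a 25-layer model split across two ranks as 12 + 13 layers
    print(split_layers([12, 13]))  # [(0, 12), (12, 25)]
```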

Quickstart with Docker

mLoRA offers an official Docker image for quick start and development. The image is available on the Docker Hub registry.

First, pull the latest image (the same image is also used for development):

```bash
docker pull yezhengmaolove/mlora:latest
```

Deploy and enter a container to run mLoRA:

```bash
docker run -itd --runtime nvidia --gpus all \
    -v ~/your_dataset_dir:/dataset \
    -v ~/your_model_dir:/model \
    -p <host_port>:22 \
    --name mlora \
    yezhengmaolove/mlora:latest

# When the container has started, log in via SSH
# (the default password is mlora@123)
ssh root@localhost -p <host_port>

# Pull the latest code and run mLoRA
cd /mLoRA
git pull
python mlora_train.py \
    --base_model TinyLlama/TinyLlama-1.1B-Chat-v0.4 \
    --config demo/lora/lora_case_1.yaml
```

Deploy as service with Docker

We can deploy mLoRA as a service that continuously receives user requests and performs fine-tuning tasks.

First, pull the latest image (the same image is used for deployment):

```bash
docker pull yezhengmaolove/mlora:latest
```

Deploy the mLoRA server:

```bash
docker run -itd --runtime nvidia --gpus all \
    -v ~/your_dataset_cache_dir:/cache \
    -v ~/your_model_dir:/model \
    -p <host_port>:8000 \
    --name mlora_server \
    -e "BASE_MODEL=TinyLlama/TinyLlama-1.1B-Chat-v0.4" \
    -e "STORAGE_DIR=/cache" \
    yezhengmaolove/mlora:latest /bin/bash /opt/deploy.sh
```

Once the service is deployed, install and use mlora_cli.py to interact with the server.

```bash
# Install the client tool
pip install mlora-cli

# Use the mlora_cli tool to connect to the mlora server
mlora_cli
(mLoRA) set port <host_port>
(mLoRA) set host http://<host_ip>

# and enjoy it!!
```

Step-by-step

Step 1. Download the mlora image and install mlora_cli

```bash
docker pull yezhengmaolove/mlora:latest
pip install mlora-cli
```

Demo: https://asciinema.org/a/TfYrIsXgfZeMxPRrzOkWZ7T4b

Step 2. Start the mLoRA server with Docker

```bash
# First, create a cache dir on the host for caching files
mkdir ~/cache
# Second, manually download the model weights from Hugging Face
mkdir ~/model && cd ~/model
git clone https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0
# Map port 8000 used by the mlora server to port 1288 on the host machine.
# The BASE_MODEL environment variable indicates the path of the base model used by mlora.
# The STORAGE_DIR environment variable indicates the path where datasets and LoRA adapters are stored.
# Use the script /opt/deploy.sh in the container to start the server.
docker run -itd --runtime nvidia --gpus all \
    -v ~/cache:/cache \
    -v ~/model:/model \
    -p 1288:8000 \
    --name mlora_server \
    -e "BASE_MODEL=/model/TinyLlama-1.1B-Chat-v1.0" \
    -e "STORAGE_DIR=/cache" \
    yezhengmaolove/mlora:latest /bin/bash /opt/deploy.sh
```

Demo: https://asciinema.org/a/LrLH0jU176NQNfawHpCITaLGx

Step 3. Use the mlora_cli tool to connect to the mLoRA server

We use mlora_cli to connect to the server at http://127.0.0.1:1288 (the http protocol must be used):

```bash
(mLoRA) set port 1288
(mLoRA) set host http://127.0.0.1
```

Demo: https://asciinema.org/a/GN1NBc2MEN8GrmcIasmIMDjNa

Step 4. Upload a data file for training

We use the Stanford Alpaca dataset as a demo; the data looks like this:

```json
[{"instruction": "", "input": "", "output": ""}, {...}]
```

```bash
(mLoRA) file upload
? file type: train data
? name: alpaca
? file path: /home/yezhengmao/alpaca-lora/alpaca_data.json
```

Demo: https://asciinema.org/a/KN41mnlMShZWDs3dIrd64L4nS

Step 5. Upload a template to provide a structured format for generating prompts

The template is a YAML file written in the Jinja2 templating language; see the demo/prompt.yaml file. The data file you upload can be considered array data, with each element in the array being a dictionary; each element is treated as a data point in the template (a small rendering sketch follows this step-by-step guide).

```bash
(mLoRA) file upload
? file type: prompt template
? name: simple_prompt
? file path: /home/yezhengmao/mLoRA/demo/prompt.yaml
```

Demo: https://asciinema.org/a/SFY8H0K4DppVvqmQCVuOuThoz

Step 6. Create a dataset

We create a dataset consisting of the data, a template, and the corresponding prompter. We can use the `dataset showcase` command to check whether the prompts are generated correctly.

```bash
(mLoRA) dataset create
? name: alpaca_dataset
? train data file: alpaca
? prompt template file: simple_prompt
? prompter: instruction
? data preprocessing: default
(mLoRA) dataset showcase
? dataset name: alpaca_dataset
```

Demo: https://asciinema.org/a/mxpwo6gWihjEEsJ0dfcXG98cM

Step 7. Create an adapter

Now we can use the `adapter create` command to create an adapter for training.

Demo: https://asciinema.org/a/Wf4PHfoGC0PCcciGHOXCkX0xj

Step 8. Submit the training task

Finally, we can submit the task to train our adapter using the defined dataset. NOTE: you can continuously submit or terminate training tasks. Use the `adapter ls` or `task ls` commands to check the tasks' status.

Demo: https://asciinema.org/a/vr8f1XtA0CBULGIP81w2bakkE
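
As a closing illustration of the prompt-template mechanism from Step 5, here is a minimal sketch of how a Jinja2 template turns one Alpaca-style data point into a prompt. The template text and field names are illustrative assumptions, not the contents of demo/prompt.yaml:

```python
# Minimal sketch (not demo/prompt.yaml): render one Alpaca-style data point
# into a prompt string with Jinja2. The template below is an assumed example.
from jinja2 import Template

template = Template(
    "Below is an instruction that describes a task.\n"
    "### Instruction:\n{{ instruction }}\n"
    "{% if input %}### Input:\n{{ input }}\n{% endif %}"
    "### Response:\n"
)

data_point = {
    "instruction": "Summarize the following text.",
    "input": "mLoRA trains multiple LoRA adapters over a shared base model.",
    "output": "mLoRA shares one base model across many LoRA adapters.",
}

print(template.render(**data_point))
```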

Why you should use mLoRA

Using mLoRA can save significant computational and memory resources when training multiple adapters simultaneously.
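
As a rough back-of-the-envelope sketch of why sharing the base model helps (all numbers below are assumptions for illustration, not measurements from mLoRA):

```python
# Assumed numbers only: compare base-weight memory for 4 LoRA fine-tuning jobs
# run with separate base-model copies vs. a single shared base model.
base_params = 7e9                          # e.g. a 7B-parameter base model
bytes_per_param = 4                        # fp32
adapter_params = 4 * 2 * 4096 * 8 * 32     # rough size of 4 LoRA adapters (B+A, r=8, 32 layers)

separate = 4 * base_params * bytes_per_param + adapter_params * bytes_per_param
shared = 1 * base_params * bytes_per_param + adapter_params * bytes_per_param

print(f"separate base copies: {separate / 1e9:.1f} GB")   # ~112 GB
print(f"shared base model:    {shared / 1e9:.1f} GB")     # ~28 GB
```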

High performance on consumer hardware

We fine-tuned multiple LoRA adapters using four A6000 graphics cards with fp32 precision, without checkpointing or any quantization techniques:

| Model | mLoRA (tokens/s) | PEFT-LoRA with FSDP (tokens/s) | PEFT-LoRA with TP (tokens/s) |
| ------------------ | ---- | ---- | ---- |
| llama-2-7b (32fp)  | 2364 | 1750 | 1500 |
| llama-2-13b (32fp) | 1280 | OOM  | 875  |

Supported model

|   | Model |
| - | ----- |
| ✓ | LLaMA |

Supported LoRA variants

|   | Variant |
| - | ---------------- |
| ✓ | QLoRA, NIPS, 2023 |
| ✓ | LoRA+, ICML, 2024 |
| ✓ | VeRA, ICLR, 2024 |
| ✓ | DoRA, ICML, 2024 |

Supported preference alignment algorithms

|   | Variant |
| - | ----------------- |
| ✓ | DPO, NeurIPS, 2024 |
| ✓ | CPO, ICML, 2024 |
| ✓ | CIT, arXiv, 2024 |

Document

Contributing

We welcome contributions to improve this repository! Please review the contribution guidelines before submitting pull requests or issues.

Fork the repository. Create a new branch for your feature or fix. Submit a pull request with a detailed explanation of your changes.

You can use pre-commit to check your code:

```bash
# Install requirements
pip install .[ci_test]
ln -s ../../.github/workflows/pre-commit .git/hooks/pre-commit
```

Or just call the script directly to check your code:

```bash
bash .github/workflows/pre-commit
```

Citation

Please cite this repository if you use the code in it.

```bibtex
@misc{ye2024mlorafinetuningloraadapters,
      title={mLoRA: Fine-Tuning LoRA Adapters via Highly-Efficient Pipeline Parallelism in Multiple GPUs},
      author={Zhengmao Ye and Dengchun Li and Zetao Hu and Tingfeng Lan and Jian Sha and Sicong Zhang and Lei Duan and Jie Zuo and Hui Lu and Yuanchun Zhou and Mingjie Tang},
      year={2024},
      eprint={2312.02515},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2312.02515},
}
```

Copyright

Copyright © 2024 All Rights Reserved.

This project is licensed under the Apache 2.0 License.

```
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
```

Owner

  • Name: Tudb Lab
  • Login: TUDB-Labs
  • Kind: organization
  • Email: zhang.yi@tudb.ai

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use Multi-LoRA, please cite it as below."
authors:
- family-names: "Ye"
  given-names: "Zhengmao"
  orcid: "https://orcid.org/0009-0003-6179-4641"
- family-names: "Li"
  given-names: "Dengchun"
  orcid: "https://orcid.org/0000-0001-6002-1661"
- family-names: "Lan"
  given-names: "Tingfeng"
  orcid: "https://orcid.org/0009-0002-7219-9031"
- family-names: "Tang"
  given-names: "Mingjie"
  orcid: "https://orcid.org/0000-0002-8893-4574"
title: "Multi-LoRA"
version: 0.1
date-released: 2023-10-05
url: "https://github.com/TUDB-Labs/multi-lora-fine-tune"

GitHub Events

Total
  • Issues event: 12
  • Watch event: 76
  • Issue comment event: 39
  • Push event: 8
  • Pull request review comment event: 21
  • Pull request review event: 14
  • Pull request event: 36
  • Fork event: 18
  • Create event: 1
Last Year
  • Issues event: 12
  • Watch event: 76
  • Issue comment event: 39
  • Push event: 8
  • Pull request review comment event: 21
  • Pull request review event: 14
  • Pull request event: 36
  • Fork event: 18
  • Create event: 1

Committers

Last synced: 6 months ago

All Time
  • Total Commits: 291
  • Total Committers: 18
  • Avg Commits per committer: 16.167
  • Development Distribution Score (DDS): 0.619
Past Year
  • Commits: 14
  • Committers: 10
  • Avg Commits per committer: 1.4
  • Development Distribution Score (DDS): 0.714
Top Committers
Name Email Commits
yezhengmao y****e@g****m 111
Michael Lee m****e@1****m 72
Jingqi Tian j****n@g****m 38
Mingjie Tang t****1@g****m 21
liam gao 5****3@q****m 16
vinkle v****t@o****m 8
Yang Shuyun 6****t 5
mingjie tang m****e@m****n 4
斑马 4****u 4
qingsongcai 8****d 3
Tingfeng Lan t****n@o****m 2
FortuneBush 1****h 1
HuangLe l****g@b****n 1
Imane Lakehal s****7@g****m 1
Jiangjun Wang 1****n 1
LongzhuoWang w****9@1****m 1
Saf9933 1****3 1
ck-gyj 2****5@q****m 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 37
  • Total pull requests: 116
  • Average time to close issues: 3 months
  • Average time to close pull requests: 3 days
  • Total issue authors: 14
  • Total pull request authors: 20
  • Average comments per issue: 3.22
  • Average comments per pull request: 0.7
  • Merged pull requests: 80
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 7
  • Pull requests: 34
  • Average time to close issues: 7 days
  • Average time to close pull requests: 6 days
  • Issue authors: 6
  • Pull request authors: 14
  • Average comments per issue: 3.71
  • Average comments per pull request: 0.65
  • Merged pull requests: 12
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • merlintang (15)
  • yezhengmao1 (7)
  • LianxinGao (5)
  • ChaoGaoUCR (2)
  • FortuneBush (2)
  • cainiaogoroad (2)
  • EricLabile (1)
  • Imanelakehal (1)
  • waitfor-night (1)
  • Z-Zili (1)
  • joker666666666 (1)
Pull Request Authors
  • yezhengmao1 (42)
  • mikecovlee (24)
  • LianxinGao (14)
  • cainiaogoroad (13)
  • Saf9933 (11)
  • yezhem (8)
  • junshijun (8)
  • FortuneBush (8)
  • waitfor-night (7)
  • ck-gyj (6)
  • merlintang (6)
  • Imanelakehal (4)
  • LongzhuoWang (2)
  • amaankhan02 (2)
  • Pherenice1125 (2)
Top Labels
Issue Labels
documentation (6) enhancement (3) good first issue (3)
Pull Request Labels
help wanted (2) bug (1) documentation (1) enhancement (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 12 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 6
  • Total maintainers: 1
pypi.org: mlora-cli

The cli tools for mLoRA system.

  • Versions: 6
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 12 Last month
Rankings
Dependent packages count: 10.7%
Average: 35.5%
Dependent repos count: 60.2%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/python-test-dev.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
.github/workflows/python-test-main.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
.github/workflows/run_on_gpu.yml actions
pyproject.toml pypi
  • bitsandbytes *
  • einops *
  • sentencepiece *
  • torch ==2.0.1
  • transformers *
  • xformers *
requirements.txt pypi
  • accelerate ==0.21.0
  • bitsandbytes ==0.40.0
  • datasets *
  • einops ==0.6.1
  • jieba *
  • nltk *
  • rouge *
  • rouge_chinese *
  • scipy ==1.10.1
  • sentencepiece ==0.1.99
  • torch ==2.0.1
  • transformers ==4.31.0
  • xformers ==0.0.20