https://github.com/adithya-s-k/mole

Mixture of LoRA Experts

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (found codemeta.json file)
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (low similarity, 10.5%, to scientific vocabulary)

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: adithya-s-k
License: gpl-3.0
Language: Python
Default Branch: main
Homepage:
Size: 21.5 KB

Statistics

Stars: 2
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created over 2 years ago · Last pushed about 2 years ago

Metadata Files

Readme License

MoLE (Mixture of LoRA Experts)

MoLE is an approach to fine-tuning large language models (LLMs) for multiple tasks at once. It uses specialized LoRA adapters, each task-specific or language-specific, to adapt the base model's behavior to the task at hand. During inference, a task classifier trained during the fine-tuning process automatically selects the appropriate adapter, which is then merged with the base model.

Key Components

Base Model

The MoLE architecture starts with a pre-trained base LLM (such as Llama, Mistral, or Gemma) that serves as the foundation for all tasks. This base model captures general language understanding and can be fine-tuned for specific tasks.

LoRA Adapters

LoRA adapters are specialized modules tailored to individual tasks. Each adapter encapsulates task-specific knowledge as a small set of low-rank weights that are applied on top of the base model to improve its output for that task.
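As a rough illustration of what applying an adapter means, the standard LoRA formulation adds a scaled low-rank product to a frozen weight matrix: W' = W + (alpha / r) * B A. The sketch below uses plain Python lists, and the helper names `matmul` and `merge_lora` are hypothetical; it is not taken from the MoLE codebase.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply an (m x k) matrix by a (k x n) matrix."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w: Matrix, lora_b: Matrix, lora_a: Matrix, alpha: float, r: int) -> Matrix:
    """Return W + (alpha / r) * (B @ A) without modifying W."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy example: a 2x2 weight matrix with a rank-1 update.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # shape (2, r) with r = 1
A = [[0.5, 0.5]]     # shape (r, 2)
merged = merge_lora(W, B, A, alpha=2.0, r=1)
```

Because the base weights are left untouched, a different adapter can be merged for the next task by starting again from the same `W`.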

Task Classifier

During the fine-tuning process, MoLE trains a task classifier that determines the most appropriate LoRA adapter for a given input. The classifier learns to identify task categories and selects the corresponding adapter to apply at runtime.
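To make the classifier's role concrete, here is a minimal stand-in that learns one bag-of-words profile per task and predicts the task whose profile best matches the input. It is only a sketch: the function names are hypothetical, and a real setup would presumably use a learned model (the training parameters mention BERT and embedding classifier options) rather than word counts.

```python
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def train_task_classifier(examples: List[Tuple[str, str]]) -> Dict[str, Counter]:
    """Build one bag-of-words profile per task label from (text, task) pairs."""
    profiles: Dict[str, Counter] = {}
    for text, task in examples:
        profiles.setdefault(task, Counter()).update(tokenize(text))
    return profiles


def predict_task(profiles: Dict[str, Counter], text: str) -> str:
    """Pick the task whose profile gives the input tokens the largest share of its mass."""
    tokens = tokenize(text)

    def score(task: str) -> float:
        profile = profiles[task]
        total = sum(profile.values()) or 1  # guard against empty profiles
        return sum(profile[t] for t in tokens) / total

    return max(profiles, key=score)


examples = [
    ("translate this sentence to french", "translation"),
    ("translate the text into german", "translation"),
    ("summarize the following article", "summarization"),
    ("write a short summary of this report", "summarization"),
]
profiles = train_task_classifier(examples)
predicted = predict_task(profiles, "please translate this paragraph")
```

The predicted label is then used as the key for looking up which adapter to apply.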

Workflow

Fine-tuning: The base model is fine-tuned on a diverse set of tasks.
LoRA Selection: A task classifier is trained concurrently to predict which LoRA adapter to use for each task.
Inference: The task classifier identifies the task category of the input, and MoLE merges the corresponding LoRA adapter with the base model to generate task-specific outputs.
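The inference step of this workflow can be sketched as a small router that maps the classifier's prediction to a registered adapter and only switches adapters when the task changes. The class name `AdapterRouter`, the keyword-based classifier, and the adapter paths are all assumptions made for illustration; the actual loading and merging of weights is left as a comment.

```python
from typing import Callable, Dict, Optional


class AdapterRouter:
    """Routes each prompt to the adapter named by a task classifier (hypothetical API)."""

    def __init__(self, classify: Callable[[str], str], adapters: Dict[str, str]) -> None:
        self.classify = classify
        self.adapters = adapters            # task name -> adapter identifier or path
        self.active: Optional[str] = None   # adapter currently applied to the base model
        self.switches = 0                   # how many times the active adapter changed

    def route(self, prompt: str) -> str:
        task = self.classify(prompt)
        if task not in self.adapters:
            raise KeyError(f"no adapter registered for task {task!r}")
        adapter = self.adapters[task]
        if adapter != self.active:
            # A real system would load or merge the adapter weights here.
            self.active = adapter
            self.switches += 1
        return adapter


def keyword_classifier(prompt: str) -> str:
    """Toy classifier: anything mentioning a function is treated as a coding task."""
    return "code" if "function" in prompt.lower() else "chat"


router = AdapterRouter(
    keyword_classifier,
    {"code": "adapters/code", "chat": "adapters/chat"},  # hypothetical paths
)
first = router.route("Write a function to sort a list")
second = router.route("Now write a function to reverse it")
third = router.route("Thanks, that helps!")
```

Supporting a new task in this sketch only requires adding an entry to the adapter mapping and teaching the classifier the new label.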

Benefits

Task Specialization: MoLE enables the base model to adapt dynamically to diverse tasks without catastrophic forgetting.
Improved Performance: Because each task is handled by its own LoRA adapter, MoLE aims to improve performance across multiple tasks compared with a single shared model.
Scalability: The modular design of MoLE allows new tasks to be added by training and registering additional LoRA adapters.

Getting Started

To use MoLE for your own tasks, follow these steps:

Prepare Data: Organize your dataset with labeled examples for each task.
Fine-tuning: Fine-tune the base LLM using MoLE, specifying the tasks of interest.
Integration: Implement task-specific LoRA adapters and train the task classifier.
Inference: Deploy MoLE for inference, where it automatically selects and applies the appropriate LoRA adapter based on the task of each input.
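For the data preparation step, one plausible layout is a mapping from task name to example texts, flattened into labeled pairs for training the task classifier. The sketch below assumes that layout; the function name `build_classifier_dataset` and the per-task hold-out split are illustrative choices, not part of MoLE.

```python
import random
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (text, task label)


def build_classifier_dataset(
    task_examples: Dict[str, List[str]], eval_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Pair], List[Pair]]:
    """Flatten {task: [texts]} into (text, task) pairs, holding out a share of each task."""
    rng = random.Random(seed)
    train: List[Pair] = []
    evaluation: List[Pair] = []
    for task, texts in task_examples.items():
        shuffled = texts[:]          # copy so the caller's lists are not reordered
        rng.shuffle(shuffled)
        # Hold out at least one example per task when there is more than one.
        n_eval = max(1, int(len(shuffled) * eval_fraction)) if len(shuffled) > 1 else 0
        evaluation += [(text, task) for text in shuffled[:n_eval]]
        train += [(text, task) for text in shuffled[n_eval:]]
    return train, evaluation


data = {
    "translation": [f"translate sentence {i}" for i in range(5)],
    "summarization": [f"summarize article {i}" for i in range(5)],
}
train_pairs, eval_pairs = build_classifier_dataset(data)
```

Splitting per task keeps every task represented in the evaluation set, which matters when tasks have very different numbers of examples.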

Installation

Clone the repo:

    git clone https://github.com/adithya-s-k/MoLE
    cd MoLE

Create a virtual environment using virtualenv or conda, depending on your preference. Python 3.10 or above is required:

    conda create -n mole-venv python=3.10 && conda activate mole-venv

Install the dependencies. For the default installation, you only need:

    pip install .

If you want to push your results to the Hugging Face Hub, don't forget to add your access token to the environment variable HUGGING_FACE_HUB_TOKEN. You can do this by running:

    huggingface-cli login

Training Parameters

```yaml
# model arguments
base_model:
tokenise_model:
model_type:

# classifier arguments
bert_classifier: true
embedding_classifier: true

# tasks arguments
tasks:
  name_of_task_1:
    dataset_name:
    dataset_subset:
    dataset_split:
    prompt_formate:
    num_epochs: 1
    steps:
  name_of_task_2:
    dataset_name:
    dataset_subset:
    dataset_split:
    prompt_formate:
    num_epochs: 1
    steps:

# lora arguments
adapter: qlora # either lora or qlora
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj

# training arguments
gradient_accumulation_steps: 4
micro_batch_size: 2
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002

# wandb arguments to track training
wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:
```
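A few quantities implied by these values are easy to check by hand. The sketch below copies the numeric settings into a hand-written dictionary (it does not parse the YAML file) and computes the effective batch size per device, the standard LoRA scaling factor alpha / r, and a plain cosine learning-rate decay. The `cosine_lr` helper is an assumption about what the cosine scheduler does; training frameworks often add warmup or other details.

```python
import math

# Values copied from the training parameters above; key names are illustrative.
config = {
    "lora_r": 32,
    "lora_alpha": 16,
    "micro_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "learning_rate": 0.0002,
}

# Samples contributing to each optimizer step on a single device.
effective_batch_size = config["micro_batch_size"] * config["gradient_accumulation_steps"]

# Standard LoRA scaling factor applied to the low-rank update.
lora_scale = config["lora_alpha"] / config["lora_r"]


def cosine_lr(step: int, total_steps: int, base_lr: float, min_lr: float = 0.0) -> float:
    """Cosine decay from base_lr to min_lr over total_steps, without warmup."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Learning rate at the start, midpoint, and end of a hypothetical 1000-step run.
schedule = [cosine_lr(s, 1000, config["learning_rate"]) for s in (0, 500, 1000)]
```

With these settings each optimizer step sees 8 samples per device and the LoRA update is scaled by 0.5.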

Contributing

We welcome contributions to MoLE! If you have ideas for improvements or would like to extend MoLE with new features, please open an issue or submit a pull request on our GitHub repository.

Owner

Name: Adithya S K
Login: adithya-s-k
Kind: user
Location: Indian
Company: Cognitivelab

Website: https://adithyask.com/
Twitter: adithya_s_k
Repositories: 60
Profile: https://github.com/adithya-s-k

Exploring Generative AI • Google DSC Lead'23 • Cloud & Full Stack Engineer • Drones & IoT • FOSS Contributor

GitHub Events

Total

Watch event: 9

Last Year

Watch event: 9

Issues and Pull Requests

Last synced: about 1 year ago

All Time

Total issues: 0
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 0
Total pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Dependencies

pyproject.toml pypi

GitPython >=3.1.41
aenum ==3.1.15
colorama *
datasets >=2.14.0
huggingface_hub >=0.22.0
nltk ==3.8.1
protobuf ==3.20.*
pycountry *
pytablewriter *
rouge_score ==0.1.2
sacrebleu *
scikit-learn *
sentencepiece >=0.1.99
spacy ==3.7.2
termcolor ==2.3.0
torch >=2.0
transformers >=4.38.0

setup.py pypi
