https://github.com/adithya-s-k/mole

Mixture of Lora Experts


Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.5%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: adithya-s-k
  • License: gpl-3.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 21.5 KB
Statistics
  • Stars: 2
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 2 years ago · Last pushed about 2 years ago
Metadata Files
Readme License

README.md

MoLE (Mixture of LoRA Experts)

MoLE is a novel approach to fine-tuning large language models (LLMs) for multiple tasks simultaneously, leveraging the concept of specialized LoRA adapters that dynamically adapt the base model's behavior based on task requirements. MoLE is designed to enhance the versatility and performance of pre-trained LLMs by incorporating task-specific/language-specific adapters. These adapters are automatically selected and merged with the base model during inference, guided by a task classifier trained during the fine-tuning process.

MoLE Architecture

Key Components

Base Model

The MoLE architecture starts with a pre-trained base LLM (such as Llama, Mistral, or Gemma) that serves as the foundation for all tasks. This base model captures general language understanding and can be fine-tuned for specific tasks.

LoRA Adapters

LoRA adapters are specialized modules tailored to individual tasks. Each LoRA adapter encapsulates task-specific knowledge and is designed to integrate with the base model to enhance its capabilities for the given task.
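To make "integrate with the base model" concrete, here is a minimal sketch of the standard LoRA update, in which a frozen weight matrix W receives a low-rank correction scaled by alpha / r. It is an illustration in plain Python rather than MoLE's actual code; the helper names `matmul` and `merge_lora` and the toy matrices are assumptions made for this example.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices given as nested lists."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "shape mismatch"
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def merge_lora(w: Matrix, lora_a: Matrix, lora_b: Matrix, alpha: float, r: int) -> Matrix:
    """Return W + (alpha / r) * (B @ A), the merged weight for one layer."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # (out x r) @ (r x in) -> (out x in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy 2x2 layer with a rank-1 adapter (hypothetical numbers).
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]    # r x in  = 1 x 2
B = [[0.5], [1.0]]  # out x r = 2 x 1
merged = merge_lora(W, A, B, alpha=2.0, r=1)
```

Because the adapter only contributes the product of two thin matrices, swapping tasks amounts to swapping `A` and `B` while `W` stays untouched.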

Task Classifier

During the fine-tuning process, MoLE trains a task classifier that determines the most appropriate LoRA adapter for a given input. This classifier learns to identify task categories and selects the corresponding LoRA adapter to apply at runtime.
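As a rough illustration of what such a classifier does, the sketch below assigns an input to the task whose examples it most resembles. It uses bag-of-words counts and cosine similarity as a stand-in for a learned BERT or embedding model; the class `TaskClassifier`, the helpers `embed` and `cosine`, and the example prompts are all hypothetical and not taken from the MoLE code base.

```python
import math
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TaskClassifier:
    """Nearest-centroid classifier over task-labelled example prompts."""

    def __init__(self) -> None:
        self.centroids: Dict[str, Counter] = {}

    def fit(self, examples: Dict[str, List[str]]) -> "TaskClassifier":
        # Sum the word counts of every example belonging to a task.
        for task, texts in examples.items():
            centroid: Counter = Counter()
            for text in texts:
                centroid.update(embed(text))
            self.centroids[task] = centroid
        return self

    def predict(self, text: str) -> str:
        # Pick the task whose centroid is most similar to the query.
        query = embed(text)
        return max(self.centroids, key=lambda task: cosine(query, self.centroids[task]))


clf = TaskClassifier().fit({
    "translation": ["translate this sentence to french", "translate the text to german"],
    "summarization": ["summarize this article", "write a short summary of the text"],
})
predicted = clf.predict("please translate this paragraph")
```

The predicted label is what selects the adapter, so a real classifier's accuracy directly bounds how often the right adapter is applied.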

Workflow

  1. Fine-tuning: The base model is fine-tuned on a diverse set of tasks.
  2. LoRA Selection: A task classifier is trained concurrently to predict which LoRA adapter to use for each task.
  3. Inference: During inference, the task classifier identifies the task category of the input, and MoLE applies the corresponding LoRA adapter to the base model to generate task-specific outputs.
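The inference step of this workflow can be sketched as a small router that classifies the prompt and then dispatches to the matching adapter. This is a hedged illustration only: `AdapterRouter`, `keyword_classifier`, and the string-returning stand-in models are invented for the example, and falling back to the base model when no adapter is registered is an assumption, not documented MoLE behavior.

```python
from typing import Callable, Dict

TextModel = Callable[[str], str]


class AdapterRouter:
    """Hypothetical router: classify the input, then apply the matching adapter."""

    def __init__(self, classify: Callable[[str], str], base_model: TextModel) -> None:
        self.classify = classify
        self.base_model = base_model
        self.adapters: Dict[str, TextModel] = {}

    def register(self, task: str, adapter: TextModel) -> None:
        """Attach an adapter-augmented model under a task name."""
        self.adapters[task] = adapter

    def generate(self, prompt: str) -> str:
        task = self.classify(prompt)
        # Assumption: use the plain base model if no adapter exists for the task.
        model = self.adapters.get(task, self.base_model)
        return model(prompt)


def keyword_classifier(prompt: str) -> str:
    """Stand-in for the trained task classifier."""
    return "translation" if "translate" in prompt.lower() else "summarization"


router = AdapterRouter(keyword_classifier, base_model=lambda p: f"[base] {p}")
router.register("translation", lambda p: f"[translation adapter] {p}")
output = router.generate("Translate: hello")
```

Registering another task is a single `register` call, which mirrors the modular design the README describes.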

Benefits

  • Task Specialization: MoLE enables the base model to adapt dynamically to diverse tasks without catastrophic forgetting.
  • Improved Performance: By leveraging task-specific LoRA adapters, MoLE achieves enhanced performance across multiple tasks.
  • Scalability: The modular design of MoLE allows for easy integration of new tasks through the addition of custom LoRA adapters.

Getting Started

To use MoLE for your own tasks, follow these steps:

  1. Prepare Data: Organize your dataset with labeled examples for each task.
  2. Fine-tuning: Fine-tune the base LLM using MoLE, specifying the tasks of interest.
  3. Integration: Implement task-specific LoRA adapters and train the task classifier.
  4. Inference: Deploy MoLE for inference, where it automatically selects and applies the appropriate LoRA adapter based on the task of each input.
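For the data-preparation step, one plausible layout is a mapping from task name to example prompts, flattened into labelled pairs for the task classifier. The sketch below assumes that layout; the function `build_classifier_dataset` and the sample prompts are hypothetical and do not describe the format MoLE itself expects.

```python
import random
from typing import Dict, List, Tuple


def build_classifier_dataset(
    task_examples: Dict[str, List[str]], seed: int = 0
) -> Tuple[List[Tuple[str, int]], Dict[str, int]]:
    """Flatten per-task prompts into shuffled (text, label_id) pairs."""
    # Sort task names so label ids are stable across runs.
    label_ids = {task: idx for idx, task in enumerate(sorted(task_examples))}
    pairs = [
        (text, label_ids[task])
        for task, texts in task_examples.items()
        for text in texts
    ]
    random.Random(seed).shuffle(pairs)  # seeded for reproducibility
    return pairs, label_ids


pairs, label_ids = build_classifier_dataset({
    "translation": ["translate this to french", "translate this to german"],
    "summarization": ["summarize this article"],
})
```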

Installation

Clone the repo:

    git clone https://github.com/adithya-s-k/MoLE
    cd MoLE

Create a virtual environment using virtualenv or conda, depending on your preference. Python 3.10 or above is required:

    conda create -n mole-venv python=3.10 && conda activate mole-venv

Install the dependencies. For the default installation, you just need:

    pip install .

If you want to push your results to the Hugging Face Hub, don't forget to provide your access token, either through the environment variable HUGGINGFACEHUB_TOKEN or by logging in with:

    huggingface-cli login

Training Parameters

```yaml
# model arguments
base_model:
tokenise_model:
model_type:

# classifier arguments
bert_classifier: true
embedding_classifier: true

# tasks arguments
tasks:
  name_of_task_1:
    dataset_name:
    dataset_subset:
    dataset_split:
    prompt_formate:
    num_epochs: 1
    steps:
  name_of_task_2:
    dataset_name:
    dataset_subset:
    dataset_split:
    prompt_formate:
    num_epochs: 1
    steps:

# lora arguments
adapter: qlora # either lora or qlora
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj

# training arguments
gradient_accumulation_steps: 4
micro_batch_size: 2
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002

# wandb arguments to track training
wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:
```
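Two quantities follow directly from the example values above: the LoRA scaling factor (alpha divided by rank, here 16 / 32) and the effective batch size per device (micro batch size times gradient accumulation steps, here 2 × 4). The short sketch below works them out; the function `derived_settings` and its argument names are made up for illustration, and the `num_devices` parameter is an assumption since the configuration shown does not mention multi-device training.

```python
from typing import Dict


def derived_settings(
    rank: int, alpha: float, micro_batch: int, grad_accum: int, num_devices: int = 1
) -> Dict[str, float]:
    """Compute quantities implied by the LoRA and training arguments."""
    return {
        # Standard LoRA multiplies the low-rank update by alpha / rank.
        "lora_scale": alpha / rank,
        # Samples contributing to each optimizer step.
        "effective_batch": micro_batch * grad_accum * num_devices,
    }


settings = derived_settings(rank=32, alpha=16, micro_batch=2, grad_accum=4)
```

With these values the adapter update is halved before being added to the base weights, and each optimizer step sees 8 samples per device.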

Contributing

We welcome contributions to MoLE! If you have ideas for improvements or would like to extend MoLE with new features, please open an issue or submit a pull request on our GitHub repository.

Owner

  • Name: Adithya S K
  • Login: adithya-s-k
  • Kind: user
  • Location: India
  • Company: Cognitivelab

Exploring Generative AI • Google DSC Lead'23 • Cloud & Full Stack Engineer • Drones & IoT • FOSS Contributor

GitHub Events

Total
  • Watch event: 9
Last Year
  • Watch event: 9

Issues and Pull Requests

Last synced: about 1 year ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Dependencies

pyproject.toml pypi
  • GitPython >=3.1.41
  • aenum ==3.1.15
  • colorama *
  • datasets >=2.14.0
  • huggingface_hub >=0.22.0
  • nltk ==3.8.1
  • protobuf ==3.20.*
  • pycountry *
  • pytablewriter *
  • rouge_score ==0.1.2
  • sacrebleu *
  • scikit-learn *
  • sentencepiece >=0.1.99
  • spacy ==3.7.2
  • termcolor ==2.3.0
  • torch >=2.0
  • transformers >=4.38.0
setup.py pypi