maestro

streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
✓ Committers with academic emails: 1 of 14 committers (7.1%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (7.3%) to scientific vocabulary

Keywords

captioning fine-tuning florence-2 multimodal objectdetection paligemma phi-3-vision qwen2-vl transformers vision-and-language vqa

Keywords from Contributors

exoplanet energy-system cryptocurrencies mesh hydrology speaker-diarization speech-enhancement speech-recognition speech-separation speech-synthesis

Scientific Fields

Mathematics Computer Science - 40% confidence

Last synced: 6 months ago

Repository


Basic Info

Host: GitHub
Owner: roboflow
License: apache-2.0
Language: Python
Default Branch: develop
Homepage: https://maestro.roboflow.com
Size: 10.6 MB

Statistics

Stars: 2,630
Watchers: 34
Forks: 217
Open Issues: 28
Releases: 2

Topics

captioning fine-tuning florence-2 multimodal objectdetection paligemma phi-3-vision qwen2-vl transformers vision-and-language vqa

Created about 2 years ago · Last pushed 6 months ago

Metadata Files

Readme, Contributing, License, Citation

maestro

[![version](https://badge.fury.io/py/maestro.svg)](https://badge.fury.io/py/maestro) [![colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/roboflow/maestro/blob/develop/cookbooks/maestro_qwen2_5_vl_json_extraction.ipynb) [![discord](https://img.shields.io/discord/1159501506232451173?logo=discord&label=discord&labelColor=fff&color=5865f2&link=https%3A%2F%2Fdiscord.gg%2FGbfgXGJ8Bk)](https://discord.gg/GbfgXGJ8Bk)

Hello

maestro is a streamlined tool to accelerate the fine-tuning of multimodal models. By encapsulating best practices from our core modules, maestro handles configuration, data loading, reproducibility, and training loop setup. It currently offers ready-to-use recipes for popular vision-language models such as Florence-2, PaliGemma 2, and Qwen2.5-VL.

Fine-tune VLMs for free

| model, task and acceleration | open in colab |
|:------------------------------------------------------------|:-------------:|
| Florence-2 (0.9B) object detection with LoRA (experimental) | |
| PaliGemma 2 (3B) JSON data extraction with LoRA | |
| Qwen2.5-VL (3B) JSON data extraction with QLoRA | |
| Qwen2.5-VL (7B) object detection with QLoRA (experimental) | |

News

2025/02/05 (1.0.0): This release introduces support for Florence-2, PaliGemma 2, and Qwen2.5-VL and includes LoRA, QLoRA, and graph freezing to keep hardware requirements in check. It offers a single CLI/SDK to reduce code complexity, and a consistent JSONL format to streamline data handling.

Quickstart

Install

To begin, install the model-specific dependencies. Since some models may have clashing requirements, we recommend creating a dedicated Python environment for each model.

```bash
pip install "maestro[paligemma_2]"
```

CLI

Kick off fine-tuning with our command-line interface, which leverages the configuration and training routines defined in each model’s core module. Simply specify key parameters such as the dataset location, number of epochs, batch size, optimization strategy, and metrics.

```bash
maestro paligemma_2 train \
  --dataset "dataset/location" \
  --epochs 10 \
  --batch-size 4 \
  --optimization_strategy "qlora" \
  --metrics "edit_distance"
```

Python

For greater control, use the Python API to fine-tune your models. Import the train function from the corresponding module and define your configuration in a dictionary. The core modules take care of reproducibility, data preparation, and training setup.

```python
from maestro.trainer.models.paligemma_2.core import train

# Keys mirror the CLI options shown above.
config = {
    "dataset": "dataset/location",
    "epochs": 10,
    "batch_size": 4,
    "optimization_strategy": "qlora",
    "metrics": ["edit_distance"],
}

train(config)
```
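The CLI flags and the configuration dictionary describe the same settings. As a rough illustration of that correspondence, the sketch below converts a config dictionary into the equivalent command line. `config_to_cli` and `FLAG_NAMES` are hypothetical helpers written for this example and are not part of maestro. The flag spellings are copied from the CLI example, and repeating `--metrics` once per list entry is an assumption about how multiple metrics would be passed.

```python
import shlex

# Flag spellings copied from the CLI example in this README.
# The mapping itself is a hypothetical helper, not a maestro API.
FLAG_NAMES = {
    "dataset": "--dataset",
    "epochs": "--epochs",
    "batch_size": "--batch-size",
    "optimization_strategy": "--optimization_strategy",
    "metrics": "--metrics",
}


def config_to_cli(model, config):
    """Return the argument list for `maestro <model> train` built from a config dict."""
    args = ["maestro", model, "train"]
    for key, value in config.items():
        flag = FLAG_NAMES[key]  # raises KeyError for settings this sketch does not know
        # Assumption: list values (such as metrics) repeat the flag once per entry.
        values = value if isinstance(value, list) else [value]
        for item in values:
            args += [flag, str(item)]
    return args


config = {
    "dataset": "dataset/location",
    "epochs": 10,
    "batch_size": 4,
    "optimization_strategy": "qlora",
    "metrics": ["edit_distance"],
}

# Print a shell-safe command string; nothing is executed here.
print(shlex.join(config_to_cli("paligemma_2", config)))
```

Keeping one dictionary as the source of truth makes it easier to switch between the CLI and the Python API without the two drifting apart.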

Contribution

We appreciate your input as we continue refining maestro. Your feedback guides our improvements. To learn how you can help, please check out our Contributing Guide. If you have any questions or ideas, feel free to start a conversation in our GitHub Discussions. Thank you for being a part of our journey!

Owner

Name: Roboflow
Login: roboflow
Kind: organization
Email: hello@roboflow.com
Location: United States of America

Website: https://roboflow.com
Twitter: roboflow
Repositories: 142
Profile: https://github.com/roboflow

Citation (CITATION.cff)

# This CITATION.cff file has been updated to reflect the current focus of maestro.
# If you use maestro for vision-language model fine-tuning, please cite it using the metadata provided below.

cff-version: 1.2.0
title: maestro
message: >-
  If you use maestro for vision-language model fine-tuning, please cite it using the metadata provided below.
type: software
authors:
  - given-names: Roboflow
    email: support@roboflow.com
repository-code: 'https://github.com/roboflow/maestro'
url: 'https://roboflow.github.io/maestro/'
abstract: >-
  maestro is a streamlined tool for fine-tuning vision-language models.
  It encapsulates best practices for configuration, data handling, reproducibility,
  and training loop setup, offering ready-to-use recipes for popular models such as
  Florence-2, PaliGemma 2, and Qwen2.5-VL.
keywords:
  - vision-language models
  - fine-tuning
  - multimodal
  - deep learning
license: MIT

GitHub Events

Total

Create event: 76
Commit comment event: 1
Release event: 1
Issues event: 35
Watch event: 1,159
Delete event: 43
Issue comment event: 116
Push event: 289
Pull request review comment event: 15
Pull request review event: 55
Pull request event: 167
Fork event: 113

Last Year

Create event: 76
Commit comment event: 1
Release event: 1
Issues event: 35
Watch event: 1,159
Delete event: 43
Issue comment event: 116
Push event: 289
Pull request review comment event: 15
Pull request review event: 55
Pull request event: 167
Fork event: 113

Committers

Last synced: 9 months ago

All Time

Total Commits: 375
Total Committers: 14
Avg Commits per committer: 26.786
Development Distribution Score (DDS): 0.488

Past Year

Commits: 319
Committers: 9
Avg Commits per committer: 35.444
Development Distribution Score (DDS): 0.539

Top Committers

Name	Email	Commits
SkalskiP	p**2@g**m	192
pre-commit-ci[bot]	6****]	54
Onuralp SEZER	t**r@g**m	42
dependabot[bot]	4****]	37
Paweł Pęczek	p**l@r**m	27
Shataxi Dubey	d**0@g**m	9
James Gallagher	j**g@j**g	7
cnukaus	c**s@g**m	1
Wheelspawn	n**e@p**m	1
Pedro Gengo	p**o@h**m	1
Matvezy	m**v@t**u	1
Joseph Nelson	j**2@g**m	1
Ikko Eltociear Ashimine	e**r@g**m	1
Deependu Jha	d**1@g**m	1
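
The All Time figures can be reproduced from the commit counts in the Top Committers table. The sketch below assumes the Development Distribution Score is defined as one minus the top committer's share of all commits. That definition is not stated on this page, but it is consistent with the published value.

```python
# Commit counts copied from the Top Committers table, in the same order.
commits = [192, 54, 42, 37, 27, 9, 7, 1, 1, 1, 1, 1, 1, 1]

total = sum(commits)                 # total commits across all committers
average = total / len(commits)       # average commits per committer

# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - max(commits) / total

print(f"Total commits: {total}")
print(f"Avg commits per committer: {average:.3f}")
print(f"DDS: {dds:.3f}")
```

A score near 0 would mean one person wrote almost every commit, while a score near 1 would mean the work is spread across many committers.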

Committer Domains (Top 20 + Academic)

trinity.edu: 1, jamesg.blog: 1, roboflow.com: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 29
Total pull requests: 242
Average time to close issues: 8 days
Average time to close pull requests: 11 days
Total issue authors: 25
Total pull request authors: 22
Average comments per issue: 0.76
Average comments per pull request: 0.49
Merged pull requests: 179
Bot issues: 1
Bot pull requests: 137

Past Year

Issues: 29
Pull requests: 224
Average time to close issues: 8 days
Average time to close pull requests: 6 days
Issue authors: 25
Pull request authors: 14
Average comments per issue: 0.76
Average comments per pull request: 0.42
Merged pull requests: 167
Bot issues: 1
Bot pull requests: 135


Top Authors

Issue Authors

SyedMohammedSameer (3)
dsbyprateekg (2)
SkalskiP (2)
TashinAhmed (1)
David-19940718 (1)
kanpuriyanawab (1)
nupoor-ka (1)
grzegorzj (1)
DP1701 (1)
dependabot[bot] (1)
fire (1)
antoan (1)
cnukaus (1)
hardikdava (1)
patel-zeel (1)

Pull Request Authors

dependabot[bot] (96)
pre-commit-ci[bot] (49)
SkalskiP (47)
onuralpszr (28)
AshAnand34 (4)
capjamesg (4)
Matvezy (2)
0xD4rky (2)
AlexBodner (2)
cnukaus (2)
AmazingK2k3 (2)
shataxiDubey (2)
itsPreto (1)
pedrogengo (1)
lab176344 (1)

Top Labels

Issue Labels

question (12), enhancement (9), bug (6), duplicate (1), dependencies (1), python (1)

Pull Request Labels

dependencies (100), python (91), enhancement (8), documentation (7), bug (3), github_actions (3), ci:github-actions (1)

Dependencies

poetry.lock pypi

contourpy 1.1.1
cycler 0.12.1
fonttools 4.45.1
importlib-resources 6.1.1
kiwisolver 1.4.5
matplotlib 3.7.4
numpy 1.24.4
opencv-python-headless 4.8.1.78
packaging 23.2
pillow 10.1.0
pyparsing 3.1.1
python-dateutil 2.8.2
pyyaml 6.0.1
scipy 1.10.1
six 1.16.0
supervision 0.17.0rc3
zipp 3.17.0

pyproject.toml pypi

python >=3.8,<3.12.0
requests ^2.31.0
supervision ^0.17.0rc4
transformers ^4.35.2

.github/workflows/docs.yml actions

actions/checkout v3 composite
actions/setup-python v4 composite

.github/workflows/welcome.yml actions

actions/first-interaction v1.2.0 composite
