composer

Supercharge Your Model Training

https://github.com/mosaicml/composer

Keywords

deep-learning machine-learning ml-efficiency ml-systems ml-training neural-network neural-networks pytorch

Keywords from Contributors

transformer cryptocurrency jax pretrained-models audio vlm speech-recognition qwen pytorch-transformers model-hub

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: mosaicml
License: apache-2.0
Language: Python
Default Branch: main
Homepage: http://docs.mosaicml.com
Size: 22 MB

Statistics

Stars: 5,402
Watchers: 49
Forks: 451
Open Issues: 54
Releases: 67

Topics

deep-learning machine-learning ml-efficiency ml-systems ml-training neural-network neural-networks pytorch

Created over 4 years ago · Last pushed 11 months ago

Metadata Files

Readme Contributing License Code of conduct Codeowners

README.md

Supercharge your Model Training

Deep Learning Framework for Training at Scale

[Website] - [Getting Started] - [Docs] - [We're Hiring!]

👋 Welcome

Composer is an open-source deep learning training library by MosaicML. Built on top of PyTorch, the Composer library makes it easier to implement distributed training workflows on large-scale clusters.

We built Composer for scalability and usability, integrating best practices for efficient multi-node training. Composer abstracts away low-level complexities such as parallelism techniques, distributed data loading, and memory optimization, so you can focus on training modern ML models and running experiments without slowing down.

We recommend using Composer to speed up your experimentation workflow if you’re training neural networks of any size, including:

Large Language Models (LLMs)
Diffusion models
Embedding models (e.g. BERT)
Transformer-based models
Convolutional Neural Networks (CNNs)

Composer is heavily used by the MosaicML research team to train state-of-the-art models like MPT, and we open-sourced this library to enable the ML community to do the same. This framework is used by organizations in both the tech industry and the academic sphere and is continually updated with new features, bug fixes, and stability improvements for production workloads.

🔑 Key Features

Composer is designed to give you better workflows, with the ability to maximize scale and customizability.

We designed Composer from the ground up for modern deep learning workloads. Gone are the days of AlexNet and ResNet, when state-of-the-art models could be trained on a couple of desktop GPUs. Today, developing the latest and greatest deep learning models often requires cluster-scale hardware — but with Composer’s help, you’ll hardly notice the difference.

The heart of Composer is our Trainer abstraction: a highly optimized PyTorch training loop designed to allow both you and your model to iterate faster. Our trainer has simple ways for you to configure your parallelization scheme, data loaders, metrics, loggers, and more.

Scalability

Whether you’re training on 1 GPU or 512 GPUs, 50MB or 10TB of data - Composer is built to keep your workflow simple.

FSDP: For models that are too large to fit on a single GPU, Composer integrates PyTorch FullyShardedDataParallel (FSDP) into our trainer and makes it simple to parallelize custom models efficiently. We’ve found FSDP is competitive performance-wise with much more complex parallelism strategies. Alternatively, Composer also supports standard PyTorch distributed data parallel (DDP) execution.
Elastic sharded checkpointing: Save on eight GPUs, resume on sixteen. Composer supports elastic sharded checkpointing, so you never have to worry if your sharded saved state is compatible with your new hardware setup.
Data streaming: Working with large datasets? Download datasets from cloud blob storage on the fly by integrating with MosaicML StreamingDataset during model training.
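
To make the "save on eight GPUs, resume on sixteen" idea concrete, here is a toy sketch of resharding. It is not Composer's or PyTorch's checkpoint format: the `shard` and `reshard` helpers are hypothetical, and a plain Python list stands in for a flattened parameter tensor. It only illustrates why a sharded state can be re-partitioned for a different number of ranks.

```python
from typing import List


def shard(flat_params: List[float], world_size: int) -> List[List[float]]:
    """Split a flat parameter list into `world_size` contiguous shards."""
    per_rank = -(-len(flat_params) // world_size)  # ceiling division
    return [flat_params[r * per_rank:(r + 1) * per_rank] for r in range(world_size)]


def reshard(shards: List[List[float]], new_world_size: int) -> List[List[float]]:
    """Reassemble saved shards and split them again for a new world size."""
    flat = [value for rank_shard in shards for value in rank_shard]
    return shard(flat, new_world_size)


# Hypothetical model state: 32 parameters, flattened.
params = [float(i) for i in range(32)]

saved = shard(params, world_size=8)            # checkpoint written by 8 ranks
resumed = reshard(saved, new_world_size=16)    # the same state loaded on 16 ranks

print(len(saved), len(saved[0]))      # 8 shards, 4 values each
print(len(resumed), len(resumed[0]))  # 16 shards, 2 values each
```

A real implementation would avoid materializing the full state on one process; the sketch gathers everything only to keep the example short.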

Customizability

Other high-level deep learning trainers provide simplicity at the cost of rigidity. When you want to add your own features, their abstractions get in your way. Composer, on the other hand, provides simple ways for you to customize our Trainer to your needs.

Fig. 1: Composer’s training loop has a series of events that occur at each stage in the training process. Callbacks are functions that users write to run at specific events. For example, our Learning Rate Monitor Callback logs the learning rate at every BATCH_END event.

Callbacks: Composer’s callback system allows you to insert custom logic at any point in the training loop. We’ve written callbacks to monitor memory usage, log and visualize images, and estimate your model’s remaining training time, to name a few. This feature is popular among researchers who want to implement and experiment with custom training techniques.
Speedup algorithms: We draw from the latest research to create a collection of algorithmic speedups. Stack these speedups into MosaicML recipes to boost your training speeds. Our team has open-sourced the optimal combinations of speedups for different types of models.
- 8x speedup: Stable Diffusion
  - $200k original SD2 cost -> $50k (Blog)
- 7x speedup: ResNet-50 on ImageNet
  - 3h33m -> 25m on 8xA100 (Blog)
- 8.8x speedup: BERT-Base Pretraining
  - 10h -> 1.13h on 8xA100 (Blog)
- 5.4x speedup: DeepLab v3 on ADE20K
  - 3h30m -> 39m on 8xA100 (Blog)
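
The event-and-callback mechanism described in this section can be pictured with a small, framework-free sketch. This is not Composer's implementation or API: `ToyEngine`, `register`, and `run_event` are hypothetical names, and only `BATCH_END` is taken from the text above (the other event names are illustrative). The callback mimics the learning-rate monitor by recording the learning rate at every `BATCH_END`.

```python
from enum import Enum
from typing import Callable, Dict, List


class Event(Enum):
    """A few illustrative events; only BATCH_END is named in the text above."""
    FIT_START = "fit_start"
    BATCH_END = "batch_end"
    EPOCH_END = "epoch_end"


class ToyEngine:
    """Hypothetical dispatcher: runs every callback registered for an event."""

    def __init__(self) -> None:
        self._callbacks: Dict[Event, List[Callable[[dict], None]]] = {e: [] for e in Event}

    def register(self, event: Event, fn: Callable[[dict], None]) -> None:
        self._callbacks[event].append(fn)

    def run_event(self, event: Event, state: dict) -> None:
        for fn in self._callbacks[event]:
            fn(state)


lr_log = []


def log_learning_rate(state: dict) -> None:
    """Callback: record (batch index, learning rate) whenever it is invoked."""
    lr_log.append((state["batch"], state["lr"]))


engine = ToyEngine()
engine.register(Event.BATCH_END, log_learning_rate)

# A stand-in training loop: three batches with a halving learning rate.
state = {"batch": 0, "lr": 0.1}
engine.run_event(Event.FIT_START, state)
for batch in range(3):
    state["batch"] = batch
    state["lr"] = 0.1 * (0.5 ** batch)
    engine.run_event(Event.BATCH_END, state)
engine.run_event(Event.EPOCH_END, state)

print(lr_log)
```

The training loop never needs to know what the callback does; it only announces events, which is what keeps custom logic out of the loop itself.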

Better workflows

Composer is built to automate away low-level pain points and headaches so you can focus on the important (and fun) parts of deep learning and iterate faster.

Auto-resumption: Failed training run? Have no fear — just re-run your code, and Composer will automatically resume from your latest saved checkpoint.
CUDA OOM Prevention: Say goodbye to out-of-memory errors. Set your microbatch size to “auto”, and Composer will automatically select the biggest one that fits on your GPUs.
Time Abstractions: Ever messed up your conversion between update steps, epochs, samples, and tokens? Specify your training duration with custom units (epochs, batches, samples, and tokens) in your training loop with our Time class.
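
As a rough illustration of automatic microbatch sizing, the sketch below halves the microbatch size until a step fits in memory. It rests on several assumptions: `FakeOOM`, `try_step`, and `find_microbatch_size` are hypothetical names, a numeric limit stands in for real GPU memory, and the halving strategy is a simple choice for the example rather than a description of how Composer searches.

```python
class FakeOOM(RuntimeError):
    """Stand-in for a CUDA out-of-memory error."""


def try_step(microbatch_size: int, memory_limit: int) -> None:
    """Hypothetical forward/backward pass that fails above `memory_limit` samples."""
    if microbatch_size > memory_limit:
        raise FakeOOM(f"microbatch of {microbatch_size} does not fit")


def find_microbatch_size(batch_size: int, memory_limit: int) -> int:
    """Halve the microbatch size until one training step succeeds."""
    microbatch_size = batch_size
    while microbatch_size >= 1:
        try:
            try_step(microbatch_size, memory_limit)
            return microbatch_size
        except FakeOOM:
            microbatch_size //= 2
    raise RuntimeError("even a microbatch of 1 does not fit in memory")


batch_size = 128
microbatch_size = find_microbatch_size(batch_size, memory_limit=48)
accumulation_steps = batch_size // microbatch_size

print(microbatch_size)     # 32: sizes 128 and 64 exceed the simulated limit
print(accumulation_steps)  # 4 microbatches are accumulated per optimizer step
```

Note that halving lands on the largest power-of-two fraction of the batch that fits (32 here), not necessarily the largest size that fits (48 here).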

Integrations

Integrate with the tools you know and love for experiment tracking and data streaming.

Cloud integrations: Our checkpointing and logging features have first-class support for saving to and loading from remote cloud buckets (OCI, GCP, AWS S3).
Experiment tracking: Weights and Biases, MLFlow, CometML, and neptune.ai: the choice is yours. Easily log your data to your favorite platform.

🚀 Getting Started

📍Prerequisites

Composer is designed for users who are comfortable with Python and have basic familiarity with deep learning fundamentals and PyTorch.

Software requirements: A recent version of PyTorch.

Hardware requirements: A system with CUDA-compatible GPUs (AMD + ROCm support coming soon!). Composer can run on CPUs, but for full benefits, we recommend using it on hardware accelerators.

💾 Installation

Composer can be installed with pip:

pip install mosaicml

To simplify the environment setup for Composer, we also provide a set of pre-built Docker images. We highly recommend you use our Docker images.

🏁 Quick Start

Here is a code snippet demonstrating our Trainer on the MNIST dataset.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

from composer import Trainer
from composer.models import ComposerClassifier
from composer.algorithms import LabelSmoothing, CutMix, ChannelsLast


class Model(nn.Module):
    """Toy convolutional neural network architecture in pytorch for MNIST."""

    def __init__(self, num_classes: int = 10):
        super().__init__()

        self.num_classes = num_classes

        self.conv1 = nn.Conv2d(1, 16, (3, 3), padding=0)
        self.conv2 = nn.Conv2d(16, 32, (3, 3), padding=0)
        self.bn = nn.BatchNorm2d(32)
        # 32 channels * (4 x 4) pooled feature map = 32 * 16 inputs
        self.fc1 = nn.Linear(32 * 16, 32)
        self.fc2 = nn.Linear(32, num_classes)

    def forward(self, x):
        out = self.conv1(x)
        out = F.relu(out)
        out = self.conv2(out)
        out = self.bn(out)
        out = F.relu(out)
        out = F.adaptive_avg_pool2d(out, (4, 4))
        out = torch.flatten(out, 1, -1)
        out = self.fc1(out)
        out = F.relu(out)
        return self.fc2(out)


# Download MNIST and wrap it in a standard PyTorch DataLoader.
transform = transforms.Compose([transforms.ToTensor()])
dataset = datasets.MNIST("data", train=True, download=True, transform=transform)
train_dataloader = DataLoader(dataset, batch_size=128)

# Train for two epochs with three speedup algorithms applied.
trainer = Trainer(
    model=ComposerClassifier(module=Model(), num_classes=10),
    train_dataloader=train_dataloader,
    max_duration="2ep",
    algorithms=[
        LabelSmoothing(smoothing=0.1),
        CutMix(alpha=1.0),
        ChannelsLast(),
    ],
)
trainer.fit()
```
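
The duration string `"2ep"` in the snippet above means two epochs. The sketch below shows one way such strings could be parsed and converted to a batch count. It is only an illustration, not Composer's Time class: `parse_duration` and `to_batches` are hypothetical helpers, and the `ba`, `sp`, and `tok` suffixes are assumed abbreviations for the batch, sample, and token units mentioned earlier (only `ep` appears in this README).

```python
import math
import re
from typing import NamedTuple

# Assumed suffixes; only "ep" is confirmed by the Quick Start snippet.
UNITS = {"ep": "epochs", "ba": "batches", "sp": "samples", "tok": "tokens"}


class Duration(NamedTuple):
    value: int
    unit: str


def parse_duration(text: str) -> Duration:
    """Parse strings such as "2ep" or "500ba" into a (value, unit) pair."""
    match = re.fullmatch(r"(\d+)(ep|ba|sp|tok)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration string: {text!r}")
    return Duration(int(match.group(1)), UNITS[match.group(2)])


def to_batches(duration: Duration, dataset_size: int, batch_size: int) -> int:
    """Convert epochs, batches, or samples to a number of batches."""
    batches_per_epoch = math.ceil(dataset_size / batch_size)
    if duration.unit == "epochs":
        return duration.value * batches_per_epoch
    if duration.unit == "batches":
        return duration.value
    if duration.unit == "samples":
        return math.ceil(duration.value / batch_size)
    raise ValueError("token durations need a tokens-per-batch figure")


two_epochs = parse_duration("2ep")
# The MNIST training split has 60,000 images; the Quick Start uses batch size 128.
total_batches = to_batches(two_epochs, dataset_size=60_000, batch_size=128)

print(two_epochs)     # Duration(value=2, unit='epochs')
print(total_batches)  # 938, i.e. 2 * ceil(60000 / 128)
```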

Next, check out our Getting Started Colab for a walk-through of Composer’s main features. In this tutorial, we will cover the basics of the Composer Trainer:

Dataloader
Trainer
Optimizer and Scheduler
Logging
Training a baseline model
Speeding up training

📚 Learn more

Once you’ve completed the Quick Start, you can go through the tutorials below or our documentation to further familiarize yourself with Composer.

If you have any questions, please feel free to reach out to us on our Community Slack!

Here are some resources actively maintained by the Composer community to help you get started:

Resource	Details
Training BERTs with Composer and 🤗	A Colab Notebook showing how to train BERT models with Composer and 🤗!
Pretraining and Finetuning an LLM Tutorial	A tutorial from MosaicML’s LLM Foundry, using MosaicML Composer, StreamingDataset, and MCLI on training and evaluating LLMs.
Migrating from PyTorch Lightning	A tutorial illustrating a path from working in PyTorch Lightning to working in Composer.
Finetuning and Pretraining HuggingFace Models	Want to use Hugging Face models with Composer? No problem. Here, we’ll walk through using Composer to fine-tune a pretrained Hugging Face BERT model.
Building Speedup Methods	A Colab Notebook showing how to build new training modifications on top of Composer

🛠️ For Best Results, Use within the Databricks & MosaicML Ecosystem

Composer can be used on its own, but for the smoothest experience we recommend using it in combination with other components of the MosaicML ecosystem:

We recommend that you train models with Composer, MosaicML StreamingDatasets, and Mosaic AI training.

Mosaic AI training (MCLI) - Our proprietary Command Line Interface (CLI) and Python SDK for orchestrating, scaling, and monitoring the GPU nodes and container images executing training and deployment. Used by our customers for training their own Generative AI models.
- To get started, reach out here and check out our Training product pages
MosaicML LLM Foundry - This open source repository contains code for training, finetuning, evaluating, and preparing LLMs for inference with Composer. Built to be easy to use, efficient, and flexible, this codebase enables rapid experimentation with the latest techniques.
MosaicML StreamingDataset - Open-source library for fast, accurate streaming from cloud storage.
MosaicML Diffusion - Open-source code to train your own Stable Diffusion model on your own data. Learn more via our blogs (Results, Speedup Details).

🏆 Project Showcase

Here are some projects and experiments that used Composer. Got something to add? Share in our Community Slack!

MPT Foundation Series: Commercially usable open source LLMs, optimized for fast training and inference and trained with Composer.
Mosaic Diffusion Models: see how we trained a stable diffusion model from scratch for <$50k
replit-code-v1-3b: A 2.7B Causal Language Model focused on Code Completion, trained by Replit on Mosaic AI training in 10 days.
BabyLLM: the first LLM to support both Arabic and English. This 7B model was trained by MetaDialog on the world’s largest Arabic/English dataset to improve customer support workflows (Blog)
BioMedLM: a domain-specific LLM for biomedicine built by MosaicML and Stanford CRFM

💫 Contributors

Composer is part of the broader Machine Learning community, and we welcome any contributions, pull requests, or issues!

To start contributing, see our Contributing page.

P.S.: We're hiring!

❓FAQ

What is the best tech stack you recommend when training large models?
- We recommend that users combine components of the MosaicML ecosystem for the smoothest experience:
  - Composer
  - StreamingDataset
  - MCLI (Databricks Mosaic AI Training)
How can I get community support for using Composer?
- You can join our Community Slack!
How does Composer compare to other trainers like NeMo Megatron and PyTorch Lightning?
- We built Composer to be optimized for both simplicity and efficiency. Community users have shared that they enjoy Composer for its capabilities and ease of use compared to alternative libraries.
How do I use Composer to train graph neural networks (GNNs), or Generative Adversarial Networks (GANs), or models for reinforcement learning (RL)?
- We recommend alternative libraries if you want to train these types of models. Many of the assumptions we made when designing Composer are suboptimal for GNNs, RL, and GANs.
How can I speed up HuggingFace downloads?
- You can use hf transfer (pip install hf-transfer) and set the environment variable HF_HUB_ENABLE_HF_TRANSFER=1
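
If you prefer to set the variable from Python rather than the shell, a minimal sketch follows. It assumes the `hf-transfer` package installs a module named `hf_transfer`, and that the variable is set before `huggingface_hub` is imported so the library sees it when it reads its settings.

```python
import importlib.util
import os

# Set this before importing huggingface_hub (assumption: the library reads
# the variable when it is first imported).
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

# Check whether the optional accelerator package is available.
hf_transfer_installed = importlib.util.find_spec("hf_transfer") is not None
if not hf_transfer_installed:
    print("hf_transfer is not installed; run: pip install hf-transfer")
```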

✍️ Citation

@misc{mosaicml2022composer,
    author = {The Mosaic ML Team},
    title = {composer},
    year = {2021},
    howpublished = {\url{https://github.com/mosaicml/composer/}},
}

Owner

Name: Databricks Mosaic Research
Login: mosaicml
Kind: organization
Location: United States of America

Website: https://www.databricks.com/research/mosaic
Twitter: DbrxMosaicAI
Repositories: 9
Profile: https://github.com/mosaicml

We remove the barriers to state-of-the-art generative AI model development and make data + AI available to all

Committers

Last synced: about 1 year ago

All Time

Total Commits: 2,628
Total Committers: 130
Avg Commits per committer: 20.215
Development Distribution Score (DDS): 0.827

Past Year

Commits: 343
Committers: 40
Avg Commits per committer: 8.575
Development Distribution Score (DDS): 0.784
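
The figures above can be checked with a few lines of arithmetic. The sketch assumes the Development Distribution Score is defined as one minus the top committer's share of commits; this page does not state the definition, but the assumption reproduces the all-time value when combined with the top committer's count (454, the first row of the Top Committers table). The helper names are hypothetical.

```python
def development_distribution_score(top_committer_commits: int, total_commits: int) -> float:
    """Assumed definition: 1 - (commits by the top committer / total commits)."""
    return 1 - top_committer_commits / total_commits


def average_commits(total_commits: int, committers: int) -> float:
    return total_commits / committers


all_time_dds = development_distribution_score(454, 2628)
all_time_avg = average_commits(2628, 130)
past_year_avg = average_commits(343, 40)

print(round(all_time_dds, 3))   # 0.827
print(round(all_time_avg, 3))   # 20.215
print(round(past_year_avg, 3))  # 8.575

# Under the same assumed definition, the past-year score of 0.784 implies the
# most active committer made roughly this many of the 343 commits:
implied_top_past_year = round((1 - 0.784) * 343)
print(implied_top_past_year)    # 74
```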

Top Committers

Name	Email	Commits
Mihir Patel	m**7@g**m	454
ravi-mosaicml	r**i@m**m	333
dependabot[bot]	4****]	240
Daniel King	4****g	183
Hanlin Tang	h**n@m**m	136
Evan Racah	e**n@m**m	106
bandish-shah	8****h	55
Charles Tang	j****k	51
Abhi Venigalla	7****c	51
Moin Nadeem	m**n@m**m	50
Daya Khudia	3****a	48
Brian	2****u	44
Landan Seguin	l**s@g**m	43
Saaketh Narayan	s**h@m**m	38
Matthew	g****x	37
bigning	n**g@d**m	36
coryMosaicML	8****L	35
dblalock	d**s@m**m	34
Daniel McNeela	d**a@g**m	34
Jamie Bloxham	j**e@m**m	32
bcui19	b**7@g**m	29
Austin	A****n	26
Vincent Chen	v**t@m**m	26
Karan Jariwala	k**a@g**m	25
nik-mosaic	1****c	25
Irene Dea	d**e@g**m	25
Rishab Parthasarathy	5****a	21
Jeremy D	1****l	21
James Knighton	i**n@g**m	20
Ajay Saini	a**y@m**m	18
and 100 more...

Committer Domains (Top 20 + Academic)

mosaicml.com: 18 databricks.com: 9 helsing.ai: 2 qni.dk: 1 moinnadeem.com: 1 txstate.edu: 1 2.7182.net: 1 cvc.uab.cat: 1 cs.washington.edu: 1 stanford.edu: 1 stevenson.io: 1 ravirahman.com: 1 habana.ai: 1 nyu.edu: 1 posteo.net: 1 amd.com: 1 me.com: 1 shoprunner.com: 1 modal.com: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 114
Total pull requests: 1,891
Average time to close issues: about 2 months
Average time to close pull requests: 12 days
Total issue authors: 80
Total pull request authors: 101
Average comments per issue: 2.89
Average comments per pull request: 0.62
Merged pull requests: 1,167
Bot issues: 4
Bot pull requests: 436

Past Year

Issues: 23
Pull requests: 437
Average time to close issues: 27 days
Average time to close pull requests: 9 days
Issue authors: 20
Pull request authors: 38
Average comments per issue: 1.83
Average comments per pull request: 0.56
Merged pull requests: 254
Bot issues: 1
Bot pull requests: 173

Top Authors

Issue Authors

antoinebrl (12)
dependabot[bot] (6)
priba (4)
Ghelfi (4)
JAEarly (3)
mbway (3)
rlrs (2)
anastasia-spb (2)
Longyichen (2)
jmif (2)
naimavahab (2)
amishparekh (2)
mvpatel2000 (2)
tbenthompson (2)
tayshinmit (2)

Pull Request Authors

dependabot[bot] (582)
mvpatel2000 (459)
dakinggg (216)
j316chuck (134)
b-chu (132)
snarayan21 (85)
bigning (79)
eracah (75)
KuuCi (61)
irenedea (47)
rithwik-db (43)
ShashankMosaicML (28)
aspfohl (28)
bmosaicml (22)
jjanezhang (20)

Top Labels

Issue Labels

bug (55) enhancement (40) dependencies (6) good first issue (1) research (1) python (1)

Pull Request Labels

dependencies (582) python (92)

Packages

Total packages: 3
Total downloads:
- pypi 326,846 last-month
Total docker downloads: 17,675,620

Total dependent packages: 4
(may contain duplicates)
Total dependent repositories: 81
(may contain duplicates)
Total versions: 194
Total maintainers: 7

pypi.org: mosaicml

Composer is a PyTorch library that enables you to train neural networks faster, at lower cost, and to higher accuracy.

Homepage: https://github.com/mosaicml/composer
Documentation: https://mosaicml.readthedocs.io/
License: apache-2.0
Latest release: 0.32.1
published 11 months ago

Versions: 69
Dependent Packages: 4
Dependent Repositories: 72
Downloads: 307,899 Last month
Docker Downloads: 17,675,609

Rankings

Docker downloads count: 0.7%

Downloads: 1.0%

Stargazers count: 1.0%

Dependent repos count: 1.8%

Average: 2.0%

Forks count: 2.8%

Dependent packages count: 4.8%

Maintainers (6)

nlsapp richard-mml bandish hanlintang XiaohanZhangCMU pdxting

Last synced: 10 months ago

pypi.org: composer

Composer is a PyTorch library that enables you to train neural networks faster, at lower cost, and to higher accuracy.

Homepage: https://github.com/mosaicml/composer
Documentation: https://composer.readthedocs.io/
License: apache-2.0
Latest release: 0.32.1
published 11 months ago

Versions: 48
Dependent Packages: 0
Dependent Repositories: 9
Downloads: 18,947 Last month
Docker Downloads: 11

Rankings

Stargazers count: 1.0%

Downloads: 2.4%

Forks count: 2.7%

Docker downloads count: 4.1%

Average: 4.2%

Dependent repos count: 4.8%

Dependent packages count: 10.1%

Maintainers (6)

nlsapp richard-mml bandish hanlintang XiaohanZhangCMU ettang

Last synced: 10 months ago

proxy.golang.org: github.com/mosaicml/composer

Documentation: https://pkg.go.dev/github.com/mosaicml/composer#section-documentation
License: apache-2.0
Latest release: v0.32.1
published 11 months ago

Versions: 77
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Dependent packages count: 5.6%

Average: 5.8%

Dependent repos count: 5.9%

Last synced: 10 months ago

composer

Science Score: 36.0%
