nemo-aligner

Scalable toolkit for efficient model alignment

https://github.com/nvidia/nemo-aligner

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (11.7%) to scientific vocabulary

Keywords from Contributors

interpretability yolov5s meshing pipeline-testing datacleaner data-profilers battery parallel animal histogram

Last synced: 6 months ago

Repository

Scalable toolkit for efficient model alignment

Basic Info

Host: GitHub
Owner: NVIDIA
License: apache-2.0
Language: Python
Default Branch: main
Homepage:
Size: 7.58 MB

Statistics

Stars: 837
Watchers: 24
Forks: 100
Open Issues: 122
Releases: 11

Created over 2 years ago · Last pushed 7 months ago

Metadata Files

Readme Changelog Contributing License Citation Security

NVIDIA NeMo-Aligner

⚠️ As of 5/15/2025, this repository is no longer actively maintained. We recommend switching to NeMo RL, a scalable and modular post-training library that offers seamless Hugging Face integration and Megatron Core optimizations and uses Ray as its scheduling backbone. ⚠️

Latest News

We released Nemotron-4-340B Base, Instruct, and Reward. The Instruct and Reward variants were trained with NeMo-Aligner. Please see the HelpSteer2 paper for more details on the reward model training.
We are excited to announce the release of accelerated generation support in our RLHF pipeline using TensorRT-LLM. For more information, please refer to our RLHF documentation.
The NeMo-Aligner paper is now out on arXiv!

Introduction

NeMo-Aligner is a scalable toolkit for efficient model alignment. The toolkit supports state-of-the-art model alignment algorithms such as SteerLM, DPO, and Reinforcement Learning from Human Feedback (RLHF). These algorithms enable users to align language models to be safer, more harmless, and more helpful. Users can perform end-to-end model alignment on a wide range of model sizes and take advantage of the toolkit's parallelism techniques to keep alignment performant and resource-efficient. For more technical details, please refer to our paper.

The NeMo-Aligner toolkit is built using the NeMo Framework, which enables scalable training across thousands of GPUs using tensor, data, and pipeline parallelism for all alignment components. Additionally, our checkpoints are cross-compatible with the NeMo ecosystem, facilitating inference deployment and further customization (https://github.com/NVIDIA/NeMo-Aligner).
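To make the interaction between tensor, data, and pipeline parallelism concrete, here is a minimal sketch of how the three sizes relate to the total GPU count in a Megatron-style layout. It assumes only these three dimensions are in use (no other forms of parallelism); the function name `data_parallel_size` is hypothetical and is not taken from NeMo-Aligner's code.

```python
def data_parallel_size(world_size: int, tensor_parallel: int, pipeline_parallel: int) -> int:
    """Return how many model replicas fit on `world_size` GPUs.

    Assumption: each replica occupies tensor_parallel * pipeline_parallel GPUs,
    and the remaining factor of the world size is used for data parallelism.
    """
    if min(world_size, tensor_parallel, pipeline_parallel) < 1:
        raise ValueError("all sizes must be positive integers")
    gpus_per_replica = tensor_parallel * pipeline_parallel
    if world_size % gpus_per_replica != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by "
            f"tensor_parallel * pipeline_parallel = {gpus_per_replica}"
        )
    return world_size // gpus_per_replica


# Illustrative numbers only: 64 GPUs, 8-way tensor and 2-way pipeline parallelism.
replicas = data_parallel_size(world_size=64, tensor_parallel=8, pipeline_parallel=2)
print(replicas)
```

With these illustrative numbers, each model replica spans 16 GPUs, which leaves room for 4 data-parallel replicas.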

The toolkit is currently in its early stages. We are committed to improving the toolkit to make it easier for developers to pick and choose different alignment algorithms to build safe, helpful, and reliable models.

Key Features

SteerLM: Attribute-Conditioned SFT as a (user-steerable) alternative to RLHF.
- Llama3-70B-SteerLM-Chat aligned with NeMo-Aligner.
- Corresponding reward model Llama3-70B-SteerLM-RM.
- Learn more at our SteerLM and HelpSteer2 papers.
Supervised Fine Tuning
Reward Model Training
Reinforcement Learning from Human Feedback using the PPO Algorithm
- Llama3-70B-PPO-Chat aligned with NeMo-Aligner using TRT-LLM.
Reinforcement Learning from Human Feedback using the REINFORCE Algorithm
- Llama-3.1-Nemotron-70B-Instruct aligned with NeMo-Aligner using TRT-LLM.
Direct Preference Optimization as described in Direct Preference Optimization: Your Language Model is Secretly a Reward Model
- Llama3-70B-DPO-Chat aligned with NeMo-Aligner.
Self-Play Fine-Tuning (SPIN) as described in Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
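
As an illustration of one of the algorithms listed above, the following is a minimal sketch of the per-example DPO loss from the Direct Preference Optimization paper, written in plain Python. It is not NeMo-Aligner's implementation: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions chosen for this example. The inputs are summed log-probabilities of the chosen and rejected responses under the policy being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss: -log(sigmoid(beta * margin)).

    margin = (policy log-ratio of chosen) - (policy log-ratio of rejected),
    where each log-ratio is measured against the reference model.
    """
    margin = (policy_chosen_logp - ref_chosen_logp) - (policy_rejected_logp - ref_rejected_logp)
    x = beta * margin
    # -log(sigmoid(x)) == log(1 + exp(-x)); computed in a numerically stable form.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# When the policy equals the reference, the margin is zero and the loss is log(2).
print(dpo_loss(-5.0, -7.0, -5.0, -7.0))
# Raising the chosen response's log-probability relative to the reference lowers the loss.
print(dpo_loss(-4.0, -7.0, -5.0, -7.0))
```

The loss falls as the policy assigns relatively more probability to the chosen response than the reference does, which is the preference signal DPO optimizes without a separate reward model.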

Learn More

Latest Release

For the latest stable release, please see the releases page. All releases come with a pre-built container. Changes within each release will be documented in CHANGELOG.

Install Your Own Environment

Requirements

NeMo-Aligner has the same requirements as the NeMo Toolkit, with the addition of PyTriton.

Quick start inside NeMo container

NeMo-Aligner is included in the NeMo containers. On a machine with NVIDIA GPUs and drivers installed, run the NeMo container:

    docker run --gpus all -it --rm --shm-size=8g --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/nemo:24.07

Once you are inside the container, NeMo-Aligner is already installed; it can be found, together with NeMo and other tools, under the /opt/ folder.
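
For readers who launch the container from a script, the sketch below restates the same `docker run` command as a Python argument list with a comment per flag. The flags and image tag are copied from the command above; the names `DOCKER_RUN` and `render` are hypothetical, and the script only prints the command rather than running it.

```python
import shlex

# Flags and image tag copied from the quick-start command; only the layout is new.
DOCKER_RUN = [
    "docker", "run",
    "--gpus", "all",               # expose all host GPUs to the container
    "-it",                         # interactive session with a terminal
    "--rm",                        # remove the container when it exits
    "--shm-size=8g",               # size of /dev/shm inside the container
    "--ulimit", "memlock=-1",      # no limit on locked memory
    "--ulimit", "stack=67108864",  # stack size limit of 64 MiB
    "nvcr.io/nvidia/nemo:24.07",   # NeMo container image
]


def render(cmd: list[str]) -> str:
    """Join an argument list into a shell-safe command string."""
    return shlex.join(cmd)


print(render(DOCKER_RUN))
```

Passing the list form directly to `subprocess.run(DOCKER_RUN)` avoids shell quoting issues if you later add arguments such as volume mounts.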

Install NeMo-Aligner

Please follow the steps outlined in the NeMo Toolkit Installation Guide. After installing NeMo, run the following additional command:

    pip install nemo-aligner

Alternatively, to install the latest commit, run the following from the root of a clone of the repository:

    pip install .
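
After installing, a quick check like the following can confirm that the package is visible to the current Python environment. This is a sketch using only the standard library; the distribution name `nemo-aligner` comes from the install command, while the import name `nemo_aligner` is an assumption, and the helper `aligner_status` is hypothetical.

```python
from importlib import metadata, util


def aligner_status(dist_name: str = "nemo-aligner", module_name: str = "nemo_aligner") -> dict:
    """Report the installed distribution version and whether the module can be found.

    Nothing is imported here: find_spec only locates a top-level module.
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    importable = util.find_spec(module_name) is not None
    return {"version": version, "importable": importable}


print(aligner_status())
```

A result with `version` set to `None` suggests the package was not installed into the interpreter you are running.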

Docker Containers

We provide an official NeMo-Aligner Dockerfile which is based on stable, tested versions of NeMo, Megatron-LM, and TransformerEngine. The primary objective of this Dockerfile is to ensure stability, although it might not always reflect the very latest versions of those three packages. You can access our Dockerfile here.

Alternatively, you can build the NeMo Dockerfile and add RUN pip install nemo-aligner at the end.

Future work

We will continue improving the stability of the PPO learning phase.
Improve the performance of RLHF.
Add TRT-LLM inference support for Rejection Sampling.

Contribute to NeMo-Aligner

We welcome community contributions! Please refer to CONTRIBUTING.md for guidelines.

Cite NeMo-Aligner in Your Work

@misc{shen2024nemoaligner,
  title={NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment},
  author={Gerald Shen and Zhilin Wang and Olivier Delalleau and Jiaqi Zeng and Yi Dong and Daniel Egert and Shengyang Sun and Jimmy Zhang and Sahil Jain and Ali Taghibakhshi and Markel Sanz Ausin and Ashwath Aithal and Oleksii Kuchaiev},
  year={2024},
  eprint={2405.01481},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

License

This toolkit is licensed under the Apache License, Version 2.0.

Owner

Name: NVIDIA Corporation
Login: NVIDIA
Kind: organization
Location: 2788 San Tomas Expressway, Santa Clara, CA, 95051

Website: https://nvidia.com
Repositories: 342
Profile: https://github.com/NVIDIA

Citation (CITATION.cff)

cff-version: 1.0.0
message: "If you use this software, please cite it as below."
title: "NeMo-Aligner: a toolkit for model alignment"
repository-code: https://github.com/NVIDIA/NeMo-Aligner
authors:
  - family-names: Shen
    given-names: Gerald
  - family-names: Delalleau
    given-names: Olivier
  - family-names: Jian
    given-names: Sahil
  - family-names: Zhang
    given-names: Jimmy
  - family-names: Zeng
    given-names: Jiaqi
  - family-names: Egert
    given-names: Daniel
  - family-names: Wang
    given-names: Zhilin
  - family-names: Yan
    given-names: Zijie
  - family-names: Dong
    given-names: Yi
  - family-names: Markel 
    given-names: Ausin
  - family-names: Taghibakhshi
    given-names: Ali
  - family-names: Tao
    given-names: Li
  - family-names: Hu
    given-names: Jian
  - family-names: Yao
    given-names: Xin
  - family-names: Liu
    given-names: Hongbin
  - family-names: Aithal
    given-names: Ashwath
  - family-names: Kuchaiev
    given-names: Oleksii

Committers

Last synced: 11 months ago

All Time

Total Commits: 230
Total Committers: 30
Avg Commits per committer: 7.667
Development Distribution Score (DDS): 0.739

Past Year

Commits: 165
Committers: 26
Avg Commits per committer: 6.346
Development Distribution Score (DDS): 0.721

Top Committers

Name	Email	Commits
Gerald Shen	1****m	60
oliver könig	o**g@n**m	46
Terry Kong	t**k@n**m	39
Anna Shors	7****1	16
trias702	2****2	13
Olivier Delalleau	5****u	10
HeyyyyyyG	4****G	8
Shengyang Sun	1****s	4
Alexander Bukharin	5****3	3
Rohit Jena	r****o	3
gleibovich-nvidia	1****a	3
Ali Taghibakhshi	7****0	2
George Armstrong	g**a@n**m	2
Julien Veron Vialard	5****d	2
Oleksii Kuchaiev	o****v	2
Yi Dong	4****2	2
Zhilin Wang	z**w@n**m	2
pre-commit-ci[bot]	6****]	1
jgerh	1****h	1
github-actions[bot]	4****]	1
Soumye Singhal	s**e@g**m	1
Sahil Jain	4****4	1
Maanu Grover	1****v	1
Igor Gitman	i**n@g**m	1
Dong Hyuk Chang	t**6@t**m	1
Chris Alexiuk	1****a	1
Charlie Truong	c**g@n**m	1
Andrew Schilling	8****v	1
Alexandros Koumparoulis	1****a	1
Adi Renduchintala	a**r@g**m	1
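
The All Time figures can be reproduced from the table above. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits; this definition is not stated on this page, but it is consistent with the reported values. The variable names are illustrative.

```python
# Values taken from the All Time statistics and the Top Committers table.
total_commits = 230
total_committers = 30
top_committer_commits = 60  # Gerald Shen, the first row of the table

# Average commits per committer.
avg_commits = total_commits / total_committers

# Assumed definition: DDS = 1 - (top committer's commits / total commits).
dds = 1 - top_committer_commits / total_commits

print(round(avg_commits, 3))  # 7.667
print(round(dds, 3))          # 0.739
```

Under this definition, a score near 0 means one person wrote almost every commit, while a score near 1 means work is spread across many committers.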

Committer Domains (Top 20 + Academic)

nvidia.com: 5
tutanota.com: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 59
Total pull requests: 391
Average time to close issues: 17 days
Average time to close pull requests: 10 days
Total issue authors: 26
Total pull request authors: 46
Average comments per issue: 1.29
Average comments per pull request: 0.37
Merged pull requests: 269
Bot issues: 0
Bot pull requests: 25

Past Year

Issues: 25
Pull requests: 298
Average time to close issues: 6 days
Average time to close pull requests: 9 days
Issue authors: 19
Pull request authors: 32
Average comments per issue: 1.32
Average comments per pull request: 0.21
Merged pull requests: 212
Bot issues: 0
Bot pull requests: 21

Top Authors

Issue Authors

odelalleau (20)
gshennvm (9)
shengyangs (6)
terrykong (5)
Cppowboy (5)
AtsunoriFujita (4)
DZ9 (2)
gleibovich-nvidia (2)
panjianfei (2)
mrm-196 (2)
sunilitggu (2)
okuchaiev (2)
noamgai21 (1)
rundiffusion (1)
arunasank (1)

Pull Request Authors

ko3n1g (185)
terrykong (126)
gshennvm (82)
github-actions[bot] (58)
ashors1 (54)
trias702 (24)
odelalleau (20)
HeyyyyyyG (13)
arendu (9)
SahilJain314 (9)
abukharin3 (9)
jveronvialard (6)
rohitrango (5)
shengyangs (5)
jgerh (4)

Top Labels

Issue Labels

bug (40) Utils (3) documentation (3) CI (3) Run CICD (3) Algorithms (2) enhancement (1)

Pull Request Labels

Run CICD (177) documentation (149) CI (145) Utils (144) Algorithms (111) Skip CICD (40) r0.5.0 (26) Servers (14) r0.7.0 (10) r0.6.0 (8) bug (2) cherry-pick (2) enhancement (2)

Packages

Total packages: 4
Total downloads:
- pypi 211 last-month

Total dependent packages: 0
(may contain duplicates)
Total dependent repositories: 0
(may contain duplicates)
Total versions: 31
Total maintainers: 3

proxy.golang.org: github.com/NVIDIA/NeMo-Aligner

Documentation: https://pkg.go.dev/github.com/NVIDIA/NeMo-Aligner#section-documentation
License: apache-2.0
Latest release: v0.7.0
published 12 months ago

Versions: 7
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Dependent packages count: 5.4%

Average: 5.6%

Dependent repos count: 5.8%

Last synced: 7 months ago

proxy.golang.org: github.com/nvidia/nemo-aligner

Documentation: https://pkg.go.dev/github.com/nvidia/nemo-aligner#section-documentation
License: apache-2.0
Latest release: v0.7.0
published 12 months ago

Versions: 7
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Dependent packages count: 6.5%

Average: 6.7%

Dependent repos count: 7.0%

Last synced: 7 months ago

proxy.golang.org: github.com/nvidia/NeMo-Aligner

Documentation: https://pkg.go.dev/github.com/nvidia/NeMo-Aligner#section-documentation
License: apache-2.0
Latest release: v0.7.0
published 12 months ago

Versions: 7
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Dependent packages count: 6.5%

Average: 6.7%

Dependent repos count: 7.0%

Last synced: 7 months ago

pypi.org: nemo-aligner

NeMo-Aligner - a toolkit for model alignment

Homepage: https://github.com/NVIDIA/NeMo-Aligner
Documentation: https://nemo-aligner.readthedocs.io/
License: Apache2
Latest release: 0.7.0
published 12 months ago

Versions: 10
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 211 Last month

Rankings

Stargazers count: 8.3%

Dependent packages count: 10.2%

Downloads: 12.3%

Average: 25.5%

Forks count: 29.8%

Dependent repos count: 67.1%

Maintainers (3)

geshen ko3n1g terryk-nvidia

Last synced: 7 months ago

Dependencies

.github/workflows/labeler.yaml actions

actions/labeler v4 composite

pyproject.toml pypi

requirements.txt pypi

nemo_toolkit *
nvidia-pytriton *

setup.py pypi
