trl

Train transformer language & seq2seq (Whisper only) models with reinforcement learning.

https://github.com/neulus/trl

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (1.9%) to scientific vocabulary
Last synced: 6 months ago

Repository

Train transformer language & seq2seq (Whisper only) models with reinforcement learning.

Basic Info
  • Host: GitHub
  • Owner: Neulus
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage: http://hf.co/docs/trl
  • Size: 8.81 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 1 year ago · Last pushed 12 months ago
Metadata Files
  • Readme
  • Contributing
  • License
  • Code of conduct
  • Citation

README.md

A hacky fork of TRL.

Forked to enable Whisper GRPO training for various purposes. Not suitable for production use.

The code probably won't work with models other than Whisper, and likely won't work with whisper-large-v3 either (due to its new sampling_rate).

Owner

  • Login: Neulus
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
title: 'TRL: Transformer Reinforcement Learning'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Leandro
    family-names: von Werra
  - given-names: Younes
    family-names: Belkada
  - given-names: Lewis
    family-names: Tunstall
  - given-names: Edward
    family-names: Beeching
  - given-names: Tristan
    family-names: Thrush
  - given-names: Nathan
    family-names: Lambert
  - given-names: Shengyi
    family-names: Huang
  - given-names: Kashif
    family-names: Rasul
  - given-names: Quentin
    family-names: Gallouédec
repository-code: 'https://github.com/huggingface/trl'
abstract: "With trl you can train transformer language models with Proximal Policy Optimization (PPO). The library is built on top of the transformers library by \U0001F917 Hugging Face. Therefore, pre-trained language models can be directly loaded via transformers. At this point, most decoder and encoder-decoder architectures are supported."
keywords:
  - rlhf
  - deep-learning
  - pytorch
  - transformers
license: Apache-2.0
version: 0.14
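
The CITATION.cff file above asks users to cite the software using its metadata. As an illustration, here is a minimal stdlib-only sketch that turns that metadata into a human-readable citation line. The author names, title, version, and repository URL are copied verbatim from the file; the output format itself is an arbitrary illustrative choice, not one prescribed by the CFF specification.

```python
# Author list copied from the CITATION.cff shown above, as (given, family) pairs.
authors = [
    ("Leandro", "von Werra"),
    ("Younes", "Belkada"),
    ("Lewis", "Tunstall"),
    ("Edward", "Beeching"),
    ("Tristan", "Thrush"),
    ("Nathan", "Lambert"),
    ("Shengyi", "Huang"),
    ("Kashif", "Rasul"),
    ("Quentin", "Gallouédec"),
]

def format_citation(authors, title, version, url):
    # "Family, G." style: family name followed by the given-name initial.
    names = ", ".join(f"{family}, {given[0]}." for given, family in authors)
    return f"{names} {title} (version {version}). {url}"

citation = format_citation(
    authors,
    "TRL: Transformer Reinforcement Learning",
    "0.14",
    "https://github.com/huggingface/trl",
)
print(citation)
```

A real tool would parse the YAML directly (e.g. with PyYAML, which is not in the standard library) rather than hardcoding the fields; this sketch only shows how the CFF metadata maps onto a citation string.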

GitHub Events

Total
  • Push event: 8
Last Year
  • Push event: 8