trl

Train transformer language & seq2seq (Whisper only) models with reinforcement learning.

https://github.com/neulus/trl

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (1.9%) to scientific vocabulary
Last synced: 6 months ago

Repository

Train transformer language & seq2seq (Whisper only) models with reinforcement learning.

Basic Info
  • Host: GitHub
  • Owner: Neulus
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage: http://hf.co/docs/trl
  • Size: 8.81 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 1 year ago · Last pushed 12 months ago
Metadata Files
  • Readme
  • Contributing
  • License
  • Code of conduct
  • Citation

README.md

A hacky fork of TRL.

Forked to enable Whisper GRPO training for various purposes. Not suitable for production use.

The code probably won't work with models other than Whisper, and likely won't work with whisper-large-v3 either (due to its new sampling_rate).

Owner

  • Login: Neulus
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
title: 'TRL: Transformer Reinforcement Learning'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Leandro
    family-names: von Werra
  - given-names: Younes
    family-names: Belkada
  - given-names: Lewis
    family-names: Tunstall
  - given-names: Edward
    family-names: Beeching
  - given-names: Tristan
    family-names: Thrush
  - given-names: Nathan
    family-names: Lambert
  - given-names: Shengyi
    family-names: Huang
  - given-names: Kashif
    family-names: Rasul
  - given-names: Quentin
    family-names: Gallouédec
repository-code: 'https://github.com/huggingface/trl'
abstract: "With trl you can train transformer language models with Proximal Policy Optimization (PPO). The library is built on top of the transformers library by \U0001F917 Hugging Face. Therefore, pre-trained language models can be directly loaded via transformers. At this point, most decoder and encoder-decoder architectures are supported."
keywords:
  - rlhf
  - deep-learning
  - pytorch
  - transformers
license: Apache-2.0
version: 0.14
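
The CITATION.cff file above asks users to cite the software using its metadata. As an illustration, here is a minimal stdlib-only sketch that turns that metadata into a human-readable citation line. The author names, title, version, and repository URL are copied verbatim from the file; the output format itself is an arbitrary illustrative choice, not one prescribed by the CFF specification.

```python
# Author list copied from the CITATION.cff shown above, as (given, family) pairs.
authors = [
    ("Leandro", "von Werra"),
    ("Younes", "Belkada"),
    ("Lewis", "Tunstall"),
    ("Edward", "Beeching"),
    ("Tristan", "Thrush"),
    ("Nathan", "Lambert"),
    ("Shengyi", "Huang"),
    ("Kashif", "Rasul"),
    ("Quentin", "Gallouédec"),
]

def format_citation(authors, title, version, url):
    # "Family, G." style: family name followed by the given-name initial.
    names = ", ".join(f"{family}, {given[0]}." for given, family in authors)
    return f"{names} {title} (version {version}). {url}"

citation = format_citation(
    authors,
    "TRL: Transformer Reinforcement Learning",
    "0.14",
    "https://github.com/huggingface/trl",
)
print(citation)
```

A real tool would parse the YAML directly (e.g. with PyYAML, which is not in the standard library) rather than hardcoding the fields; this sketch only shows how the CFF metadata maps onto a citation string.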

GitHub Events

Total
  • Push event: 8
Last Year
  • Push event: 8