trl
Train transformer language & seq2seq (Whisper only) models with reinforcement learning.
Science Score: 44.0%
This score estimates how likely the project is to be science-related, based on the following indicators (a toy sketch of how such indicator-based scores can be aggregated follows the list):
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references: not found
- ○ Academic publication links: not found
- ○ Academic email domains: not found
- ○ Institutional organization owner: not found
- ○ JOSS paper metadata: not found
- ○ Scientific vocabulary similarity: low similarity (1.9%) to scientific vocabulary
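The indexer's actual formula is not published here; as a rough illustration of indicator-based scoring, the sketch below (my assumption, with made-up weights) combines boolean indicators with the vocabulary-similarity term:

```python
# Toy sketch (assumed weights, not the indexer's real formula): a science score
# as a weighted mean of boolean indicators plus a vocabulary-similarity term.
indicators = {
    "citation_cff": True,
    "codemeta_json": True,
    "zenodo_json": True,
    "doi_references": False,
    "publication_links": False,
    "academic_emails": False,
    "institutional_owner": False,
    "joss_metadata": False,
}
vocab_similarity = 0.019  # reported similarity to scientific vocabulary

score = 0.9 * (sum(indicators.values()) / len(indicators)) + 0.1 * vocab_similarity
print(f"Science score: {score:.1%}")  # illustrative only; won't match the site's 44.0%
```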
Last synced: 6 months ago
Repository
Basic Info
- Host: GitHub
- Owner: Neulus
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: http://hf.co/docs/trl
- Size: 8.81 MB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Created about 1 year ago · Last pushed 12 months ago
Metadata Files
- Readme
- Contributing
- License
- Code of conduct
- Citation
README.md
A hacky fork of TRL.
Forked to enable GRPO training on Whisper models for various purposes. Not suitable for production use.
The code probably won't work with models other than Whisper. It also probably won't work for whisper-large-v3 (due to the new sampling_rate).
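Since the fork targets GRPO training for Whisper, here is a minimal sketch of what a run might look like, assuming the fork keeps upstream TRL's GRPOTrainer/GRPOConfig interface and accepts a Whisper checkpoint directly. The dataset, reward function, and checkpoint names are illustrative assumptions, not confirmed by this repository.

```python
# Hypothetical sketch, not the fork's documented API: GRPO training for Whisper,
# assuming this fork keeps upstream TRL's GRPOTrainer/GRPOConfig interface.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

def reward_short(completions, **kwargs):
    # Toy reward favoring shorter transcriptions; a real setup would likely
    # score completions against reference text (e.g. negative WER).
    return [-float(len(c)) for c in completions]

# Illustrative audio dataset; the fork's expected dataset format is an assumption.
dataset = load_dataset("mozilla-foundation/common_voice_11_0", "en", split="train")

trainer = GRPOTrainer(
    model="openai/whisper-small",  # per the README, whisper-large-v3 likely won't work
    reward_funcs=reward_short,
    args=GRPOConfig(output_dir="whisper-grpo"),
    train_dataset=dataset,
)
trainer.train()
```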
Owner
- Login: Neulus
- Kind: user
- Repositories: 3
- Profile: https://github.com/Neulus
Citation (CITATION.cff)
cff-version: 1.2.0
title: 'TRL: Transformer Reinforcement Learning'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Leandro
    family-names: von Werra
  - given-names: Younes
    family-names: Belkada
  - given-names: Lewis
    family-names: Tunstall
  - given-names: Edward
    family-names: Beeching
  - given-names: Tristan
    family-names: Thrush
  - given-names: Nathan
    family-names: Lambert
  - given-names: Shengyi
    family-names: Huang
  - given-names: Kashif
    family-names: Rasul
  - given-names: Quentin
    family-names: Gallouédec
repository-code: 'https://github.com/huggingface/trl'
abstract: >-
  With trl you can train transformer language models with Proximal Policy
  Optimization (PPO). The library is built on top of the transformers library
  by 🤗 Hugging Face. Therefore, pre-trained language models can be directly
  loaded via transformers. At this point, most decoder and encoder-decoder
  architectures are supported.
keywords:
  - rlhf
  - deep-learning
  - pytorch
  - transformers
license: Apache-2.0
version: 0.14
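As a quick illustration of consuming this metadata, the sketch below parses CITATION.cff and prints a one-line citation; the choice of PyYAML and the output format are assumptions, and any YAML parser would do.

```python
# Sketch: build a plain-text citation line from CITATION.cff.
# Assumes PyYAML is installed (pip install pyyaml).
import yaml

with open("CITATION.cff") as f:
    cff = yaml.safe_load(f)

authors = ", ".join(
    f"{a['given-names']} {a['family-names']}" for a in cff["authors"]
)
print(f"{authors}. {cff['title']} (v{cff['version']}). {cff['repository-code']}")
```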
GitHub Events
- Total push events: 8
- Push events in the last year: 8