Science Score: 44.0%

This score indicates how likely this project is to be science-related, based on the following indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (1.7%) to scientific vocabulary
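The first three indicators above are simple file-presence checks. A minimal sketch of that part of such a scorer (the file list comes from the indicators above; the function and its behavior are an illustration, not the actual tool's implementation):

```python
from pathlib import Path

# File-based science indicators, taken from the list above.
INDICATOR_FILES = ["CITATION.cff", "codemeta.json", ".zenodo.json"]

def found_indicators(repo_root: str) -> list:
    """Return the indicator files present at the top level of a repository."""
    root = Path(repo_root)
    return [name for name in INDICATOR_FILES if (root / name).is_file()]
```

A real scorer would combine these boolean hits with the text-based signals (DOI references, vocabulary similarity) into a weighted score.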
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: ivangabriele
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 807 KB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 1
  • Releases: 0
Created 9 months ago · Last pushed 9 months ago
Metadata Files
  • Readme
  • Contributing
  • License
  • Code of conduct
  • Citation

README.md


---
title: TRL Sandbox
emoji: 🧪
colorFrom: gray
colorTo: gray
sdk: docker
pinned: false
---

TRL Sandbox


Owner

  • Name: Ivan Gabriele
  • Login: ivangabriele
  • Kind: user
  • Location: Paris
  • Company: @betagouv

Passionate about everything.

Citation (CITATION.cff)

cff-version: 1.2.0
title: 'TRL: Transformer Reinforcement Learning'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Leandro
    family-names: von Werra
  - given-names: Younes
    family-names: Belkada
  - given-names: Lewis
    family-names: Tunstall
  - given-names: Edward
    family-names: Beeching
  - given-names: Tristan
    family-names: Thrush
  - given-names: Nathan
    family-names: Lambert
  - given-names: Shengyi
    family-names: Huang
  - given-names: Kashif
    family-names: Rasul
  - given-names: Quentin
    family-names: Gallouédec
repository-code: 'https://github.com/huggingface/trl'
abstract: "With trl you can train transformer language models with Proximal Policy Optimization (PPO). The library is built on top of the transformers library by \U0001F917 Hugging Face. Therefore, pre-trained language models can be directly loaded via transformers. At this point, most decoder and encoder-decoder architectures are supported."
keywords:
  - rlhf
  - deep-learning
  - pytorch
  - transformers
license: Apache-2.0
version: 0.18
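The CFF fields above map directly onto a software citation. A stdlib-only sketch of rendering them as a BibTeX `@software` entry (the `cff` dict mirrors the file above, trimmed to two authors for brevity; the entry key `trl` is an arbitrary choice):

```python
# Mirror of the CITATION.cff metadata above, trimmed to two authors.
cff = {
    "title": "TRL: Transformer Reinforcement Learning",
    "authors": [
        {"given-names": "Leandro", "family-names": "von Werra"},
        {"given-names": "Quentin", "family-names": "Gallouédec"},
    ],
    "repository-code": "https://github.com/huggingface/trl",
    "version": "0.18",
}

def cff_to_bibtex(meta: dict) -> str:
    """Render CFF software metadata as a BibTeX @software entry."""
    authors = " and ".join(
        f"{a['family-names']}, {a['given-names']}" for a in meta["authors"]
    )
    return (
        "@software{trl,\n"
        f"  author = {{{authors}}},\n"
        f"  title = {{{meta['title']}}},\n"
        f"  url = {{{meta['repository-code']}}},\n"
        f"  version = {{{meta['version']}}}\n"
        "}"
    )

print(cff_to_bibtex(cff))
```

GitHub's "Cite this repository" button performs an equivalent conversion from the same file.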

GitHub Events

Total
  • Push event: 1
  • Pull request event: 2
Last Year
  • Push event: 1
  • Pull request event: 2

Issues and Pull Requests



Dependencies

Dockerfile docker
  • nvidia/cuda 12.9.0-cudnn-devel-ubuntu24.04 build
docker/trl-latest-gpu/Dockerfile docker
  • continuumio/miniconda3 latest build
  • nvidia/cuda 12.2.2-devel-ubuntu22.04 build
docker/trl-source-gpu/Dockerfile docker
  • continuumio/miniconda3 latest build
  • nvidia/cuda 12.2.2-devel-ubuntu22.04 build
docker-compose.yml docker
examples/research_projects/stack_llama_2/scripts/requirements.txt pypi
  • accelerate *
  • bitsandbytes *
  • datasets *
  • peft *
  • transformers *
  • trl *
  • wandb *
pyproject.toml pypi
  • accelerate ==0.34.2
  • datasets >=3.6.0
  • deepspeed >=0.17.1
  • huggingface [cli]>=0.0.1
  • peft >=0.15.2
  • torch ==2.4.*
  • transformers ==4.50.*
  • trl ==0.18.*
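The pyproject pins above mix exact (`==`) and range (`>=`, `==2.4.*`) specifiers, while `uv.lock` below records fully resolved versions. A small sketch of checking resolved versions against exact pins (the helper and the sample dicts are illustrations; version values are taken from the lists above, with the wildcard pins simplified to exact ones):

```python
def check_pins(installed: dict, pins: dict) -> list:
    """Return (package, wanted, got) tuples for unsatisfied exact pins."""
    problems = []
    for name, wanted in pins.items():
        got = installed.get(name)
        if got != wanted:
            problems.append((name, wanted, got))
    return problems

# Exact pins from pyproject.toml above (wildcards simplified for this sketch).
pins = {"accelerate": "0.34.2", "torch": "2.4.1"}
# Resolved versions as a lockfile like uv.lock would record them.
installed = {"accelerate": "0.34.2", "torch": "2.4.0"}
print(check_pins(installed, pins))  # reports the torch mismatch
```

In a live environment, the `installed` dict could be populated with `importlib.metadata.version(name)` per package instead of a hand-written mapping.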
setup.py pypi
uv.lock pypi
  • accelerate 0.34.2
  • aiohappyeyeballs 2.6.1
  • aiohttp 3.12.13
  • aiosignal 1.3.2
  • annotated-types 0.7.0
  • attrs 25.3.0
  • certifi 2025.6.15
  • charset-normalizer 3.4.2
  • colorama 0.4.6
  • datasets 3.6.0
  • deepspeed 0.17.1
  • dill 0.3.8
  • einops 0.8.1
  • filelock 3.18.0
  • frozenlist 1.7.0
  • fsspec 2025.3.0
  • hf-xet 1.1.3
  • hjson 3.1.0
  • huggingface 0.0.1
  • huggingface-hub 0.33.0
  • idna 3.10
  • jinja2 3.1.6
  • markupsafe 3.0.2
  • mpmath 1.3.0
  • msgpack 1.1.1
  • multidict 6.4.4
  • multiprocess 0.70.16
  • networkx 3.5
  • ninja 1.11.1.4
  • numpy 2.3.0
  • nvidia-cublas-cu12 12.1.3.1
  • nvidia-cuda-cupti-cu12 12.1.105
  • nvidia-cuda-nvrtc-cu12 12.1.105
  • nvidia-cuda-runtime-cu12 12.1.105
  • nvidia-cudnn-cu12 9.1.0.70
  • nvidia-cufft-cu12 11.0.2.54
  • nvidia-curand-cu12 10.3.2.106
  • nvidia-cusolver-cu12 11.4.5.107
  • nvidia-cusparse-cu12 12.1.0.106
  • nvidia-nccl-cu12 2.20.5
  • nvidia-nvjitlink-cu12 12.6.85
  • nvidia-nvtx-cu12 12.1.105
  • packaging 25.0
  • pandas 2.3.0
  • peft 0.15.2
  • propcache 0.3.2
  • psutil 7.0.0
  • py-cpuinfo 9.0.0
  • pyarrow 20.0.0
  • pydantic 2.11.7
  • pydantic-core 2.33.2
  • python-dateutil 2.9.0.post0
  • pytz 2025.2
  • pyyaml 6.0.2
  • regex 2024.11.6
  • requests 2.32.4
  • safetensors 0.5.3
  • setuptools 80.9.0
  • six 1.17.0
  • sympy 1.14.0
  • tokenizers 0.21.1
  • torch 2.4.1
  • tqdm 4.67.1
  • transformers 4.50.3
  • triton 3.0.0
  • trl 0.18.2
  • trl-sandbox 0.0.0
  • typing-extensions 4.14.0
  • typing-inspection 0.4.1
  • tzdata 2025.2
  • urllib3 2.4.0
  • xxhash 3.5.0
  • yarl 1.20.1