Science Score: 44.0%

This score indicates how likely this project is to be science-related, based on the following indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (1.7%) to scientific vocabulary
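The first three indicators above are simple file-presence checks. A minimal sketch of that part of such a scorer (the file list comes from the indicators above; the function and its behavior are an illustration, not the actual tool's implementation):

```python
from pathlib import Path

# File-based science indicators, taken from the list above.
INDICATOR_FILES = ["CITATION.cff", "codemeta.json", ".zenodo.json"]

def found_indicators(repo_root: str) -> list:
    """Return the indicator files present at the top level of a repository."""
    root = Path(repo_root)
    return [name for name in INDICATOR_FILES if (root / name).is_file()]
```

A real scorer would combine these boolean hits with the text-based signals (DOI references, vocabulary similarity) into a weighted score.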
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: ivangabriele
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 807 KB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 1
  • Releases: 0
Created 9 months ago · Last pushed 9 months ago
Metadata Files
  • Readme
  • Contributing
  • License
  • Code of conduct
  • Citation

README.md


---
title: TRL Sandbox
emoji: 🧪
colorFrom: gray
colorTo: gray
sdk: docker
pinned: false
---

TRL Sandbox


Owner

  • Name: Ivan Gabriele
  • Login: ivangabriele
  • Kind: user
  • Location: Paris
  • Company: @betagouv

Passionate about everything.

Citation (CITATION.cff)

cff-version: 1.2.0
title: 'TRL: Transformer Reinforcement Learning'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Leandro
    family-names: von Werra
  - given-names: Younes
    family-names: Belkada
  - given-names: Lewis
    family-names: Tunstall
  - given-names: Edward
    family-names: Beeching
  - given-names: Tristan
    family-names: Thrush
  - given-names: Nathan
    family-names: Lambert
  - given-names: Shengyi
    family-names: Huang
  - given-names: Kashif
    family-names: Rasul
  - given-names: Quentin
    family-names: Gallouédec
repository-code: 'https://github.com/huggingface/trl'
abstract: "With trl you can train transformer language models with Proximal Policy Optimization (PPO). The library is built on top of the transformers library by \U0001F917 Hugging Face. Therefore, pre-trained language models can be directly loaded via transformers. At this point, most decoder and encoder-decoder architectures are supported."
keywords:
  - rlhf
  - deep-learning
  - pytorch
  - transformers
license: Apache-2.0
version: 0.18
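The CFF fields above map directly onto a software citation. A stdlib-only sketch of rendering them as a BibTeX `@software` entry (the `cff` dict mirrors the file above, trimmed to two authors for brevity; the entry key `trl` is an arbitrary choice):

```python
# Mirror of the CITATION.cff metadata above, trimmed to two authors.
cff = {
    "title": "TRL: Transformer Reinforcement Learning",
    "authors": [
        {"given-names": "Leandro", "family-names": "von Werra"},
        {"given-names": "Quentin", "family-names": "Gallouédec"},
    ],
    "repository-code": "https://github.com/huggingface/trl",
    "version": "0.18",
}

def cff_to_bibtex(meta: dict) -> str:
    """Render CFF software metadata as a BibTeX @software entry."""
    authors = " and ".join(
        f"{a['family-names']}, {a['given-names']}" for a in meta["authors"]
    )
    return (
        "@software{trl,\n"
        f"  author = {{{authors}}},\n"
        f"  title = {{{meta['title']}}},\n"
        f"  url = {{{meta['repository-code']}}},\n"
        f"  version = {{{meta['version']}}}\n"
        "}"
    )

print(cff_to_bibtex(cff))
```

GitHub's "Cite this repository" button performs an equivalent conversion from the same file.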

GitHub Events

Total
  • Push event: 1
  • Pull request event: 2
Last Year
  • Push event: 1
  • Pull request event: 2

Issues and Pull Requests



Dependencies

Dockerfile docker
  • nvidia/cuda 12.9.0-cudnn-devel-ubuntu24.04 build
docker/trl-latest-gpu/Dockerfile docker
  • continuumio/miniconda3 latest build
  • nvidia/cuda 12.2.2-devel-ubuntu22.04 build
docker/trl-source-gpu/Dockerfile docker
  • continuumio/miniconda3 latest build
  • nvidia/cuda 12.2.2-devel-ubuntu22.04 build
docker-compose.yml docker
examples/research_projects/stack_llama_2/scripts/requirements.txt pypi
  • accelerate *
  • bitsandbytes *
  • datasets *
  • peft *
  • transformers *
  • trl *
  • wandb *
pyproject.toml pypi
  • accelerate ==0.34.2
  • datasets >=3.6.0
  • deepspeed >=0.17.1
  • huggingface [cli]>=0.0.1
  • peft >=0.15.2
  • torch ==2.4.*
  • transformers ==4.50.*
  • trl ==0.18.*
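The pyproject pins above mix exact (`==`) and range (`>=`, `==2.4.*`) specifiers, while `uv.lock` below records fully resolved versions. A small sketch of checking resolved versions against exact pins (the helper and the sample dicts are illustrations; version values are taken from the lists above, with the wildcard pins simplified to exact ones):

```python
def check_pins(installed: dict, pins: dict) -> list:
    """Return (package, wanted, got) tuples for unsatisfied exact pins."""
    problems = []
    for name, wanted in pins.items():
        got = installed.get(name)
        if got != wanted:
            problems.append((name, wanted, got))
    return problems

# Exact pins from pyproject.toml above (wildcards simplified for this sketch).
pins = {"accelerate": "0.34.2", "torch": "2.4.1"}
# Resolved versions as a lockfile like uv.lock would record them.
installed = {"accelerate": "0.34.2", "torch": "2.4.0"}
print(check_pins(installed, pins))  # reports the torch mismatch
```

In a live environment, the `installed` dict could be populated with `importlib.metadata.version(name)` per package instead of a hand-written mapping.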
setup.py pypi
uv.lock pypi
  • accelerate 0.34.2
  • aiohappyeyeballs 2.6.1
  • aiohttp 3.12.13
  • aiosignal 1.3.2
  • annotated-types 0.7.0
  • attrs 25.3.0
  • certifi 2025.6.15
  • charset-normalizer 3.4.2
  • colorama 0.4.6
  • datasets 3.6.0
  • deepspeed 0.17.1
  • dill 0.3.8
  • einops 0.8.1
  • filelock 3.18.0
  • frozenlist 1.7.0
  • fsspec 2025.3.0
  • hf-xet 1.1.3
  • hjson 3.1.0
  • huggingface 0.0.1
  • huggingface-hub 0.33.0
  • idna 3.10
  • jinja2 3.1.6
  • markupsafe 3.0.2
  • mpmath 1.3.0
  • msgpack 1.1.1
  • multidict 6.4.4
  • multiprocess 0.70.16
  • networkx 3.5
  • ninja 1.11.1.4
  • numpy 2.3.0
  • nvidia-cublas-cu12 12.1.3.1
  • nvidia-cuda-cupti-cu12 12.1.105
  • nvidia-cuda-nvrtc-cu12 12.1.105
  • nvidia-cuda-runtime-cu12 12.1.105
  • nvidia-cudnn-cu12 9.1.0.70
  • nvidia-cufft-cu12 11.0.2.54
  • nvidia-curand-cu12 10.3.2.106
  • nvidia-cusolver-cu12 11.4.5.107
  • nvidia-cusparse-cu12 12.1.0.106
  • nvidia-nccl-cu12 2.20.5
  • nvidia-nvjitlink-cu12 12.6.85
  • nvidia-nvtx-cu12 12.1.105
  • packaging 25.0
  • pandas 2.3.0
  • peft 0.15.2
  • propcache 0.3.2
  • psutil 7.0.0
  • py-cpuinfo 9.0.0
  • pyarrow 20.0.0
  • pydantic 2.11.7
  • pydantic-core 2.33.2
  • python-dateutil 2.9.0.post0
  • pytz 2025.2
  • pyyaml 6.0.2
  • regex 2024.11.6
  • requests 2.32.4
  • safetensors 0.5.3
  • setuptools 80.9.0
  • six 1.17.0
  • sympy 1.14.0
  • tokenizers 0.21.1
  • torch 2.4.1
  • tqdm 4.67.1
  • transformers 4.50.3
  • triton 3.0.0
  • trl 0.18.2
  • trl-sandbox 0.0.0
  • typing-extensions 4.14.0
  • typing-inspection 0.4.1
  • tzdata 2025.2
  • urllib3 2.4.0
  • xxhash 3.5.0
  • yarl 1.20.1