https://github.com/amir22010/lm-human-preferences
Code for the paper Fine-Tuning Language Models from Human Preferences
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
○codemeta.json file
-
○.zenodo.json file
-
○DOI references
-
✓Academic publication links
Links to: arxiv.org -
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (11.4%) to scientific vocabulary
Last synced: 9 months ago
·
JSON representation
Repository
Code for the paper Fine-Tuning Language Models from Human Preferences
Basic Info
- Host: GitHub
- Owner: Amir22010
- License: mit
- Default Branch: master
- Homepage: https://openai.com/blog/fine-tuning-gpt-2/
- Size: 57.6 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of openai/lm-human-preferences
Created over 6 years ago
· Last pushed over 6 years ago
https://github.com/Amir22010/lm-human-preferences/blob/master/
**Status:** Archive (code is provided as-is, no updates expected)
# lm-human-preferences
This repository contains code for the paper [Fine-Tuning Language Models from Human Preferences](https://arxiv.org/abs/1909.08593). See also our [blog post](https://openai.com/blog/fine-tuning-gpt-2/).
We provide code for:
- Training reward models from human labels
- Fine-tuning language models using those reward models
It does not contain code for generating labels. However, we have released human labels collected for our experiments, at `gs://lm-human-preferences/labels`.
For those interested, the question and label schemas are simple and documented in [`label_types.py`](./lm_human_preferences/label_types.py).
The code has only been tested using the smallest GPT-2 model (124M parameters).
## Instructions
This code has only been tested using Python 3.7.3. Training has been tested on GCE machines with 8 V100s, running Ubuntu 16.04, but development also works on Mac OS X.
### Installation
- Install [pipenv](https://github.com/pypa/pipenv#installation).
- Install [tensorflow](https://www.tensorflow.org/install/gpu): Install CUDA 10.0 and cuDNN 7.6.2, then `pipenv install tensorflow-gpu==1.13.1`. The code may technically run with tensorflow on CPU but will be very slow.
- Install [`gsutil`](https://cloud.google.com/storage/docs/gsutil_install)
- Clone this repo. Then:
```
pipenv install
```
- (Recommended) Install [`horovod`](https://github.com/horovod/horovod#install) to speed up the code, or otherwise substitute some fast implementation in the `mpi_allreduce_sum` function of [`core.py`](./lm_human_preferences/utils/core.py). Make sure to use pipenv for the install, e.g. `pipenv install horovod==0.18.1`.
### Running
The following examples assume we are aiming to train a model to continue text in a physically descriptive way.
You can read [`launch.py`](./launch.py) to see how the `descriptiveness` experiments and others are defined.
Note that we provide pre-trained models, so you can skip directly to RL fine-tuning or even to sampling from a trained policy, if desired.
#### Training a reward model
To train a reward model, use a command such as
```
experiment=descriptiveness
reward_experiment_name=testdesc-$(date +%y%m%d%H%M)
pipenv run ./launch.py train_reward $experiment $reward_experiment_name
```
This will save outputs (and tensorboard event files) to the directory `/tmp/save/train_reward/$reward_experiment_name`. The directory can be changed via the `--save_dir` flag.
#### Finetuning a language model
Once you have trained a reward model, you can finetune against it.
First, set
```
trained_reward_model=/tmp/save/train_reward/$reward_experiment_name
```
or if using our pretrained model,
```
trained_reward_model=gs://lm-human-preferences/runs/descriptiveness/reward_model
```
Then,
```
experiment=descriptiveness
policy_experiment_name=testdesc-$(date +%y%m%d%H%M)
pipenv run ./launch.py train_policy $experiment $policy_experiment_name --rewards.trained_model $trained_reward_model --rewards.train_new_model 'off'
```
This will save outputs (and tensorboard event files) to the directory `/tmp/save/train_policy/$policy_experiment_name`. The directory can be changed via the `--save_dir` flag.
#### Both steps at once
You can run a single command to train a reward model and then finetune against it
```
experiment=descriptiveness
experiment_name=testdesc-$(date +%y%m%d%H%M)
pipenv run ./launch.py train_policy $experiment $experiment_name
```
In this case, outputs are in the directory `/tmp/save/train_policy/$policy_experiment_name`, and the reward model is saved to a subdirectory `reward_model`. The directory can be changed via the `--save_dir` flag.
#### Sampling from a trained policy
Specify the policy to load:
```
save_dir=/tmp/save/train_policy/$policy_experiment_name
```
or if using our pretrained model,
```
save_dir=gs://lm-human-preferences/runs/descriptiveness
```
Then run:
```
pipenv run ./sample.py sample --save_dir $save_dir --savescope policy
```
Note that this script can run on less than 8 GPUs. You can pass the flag `--mpi 1`, for exapmle, if you only have one GPU.
## LICENSE
[MIT](./LICENSE)
## Citation
Please cite the paper with the following bibtex entry:
```
@article{ziegler2019finetuning,
title={Fine-Tuning Language Models from Human Preferences},
author={Ziegler, Daniel M. and Stiennon, Nisan and Wu, Jeffrey and Brown, Tom B. and Radford, Alec and Amodei, Dario and Christiano, Paul and Irving, Geoffrey},
journal={arXiv preprint arXiv:1909.08593},
url={https://arxiv.org/abs/1909.08593},
year={2019}
}
```
Owner
- Name: Amir Khan
- Login: Amir22010
- Kind: user
- Location: India
- Repositories: 3
- Profile: https://github.com/Amir22010
working on developing a state of art AI solutions mainly in computer vision, chat bots and nlp domain. building an awesome AI as a professional developer 😍.