https://github.com/cyberagentailab/srbon
Repository
Basic Info
- Host: GitHub
- Owner: CyberAgentAILab
- License: gpl-3.0
- Language: Python
- Default Branch: main
- Size: 31.3 KB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Stochastic Regularized Best-of-N Sampling (SRBoN)
This repository contains the implementation of Stochastic Regularized Best-of-N Sampling (SRBoN).
The code has been tested with Python 3.8 and CUDA 11.0.
Setup Instructions
Step 1: Environment Setup
Create a virtual environment, then install the required dependencies:
```bash
# Create virtual environment
python3 -m venv env
source env/bin/activate

# Install dependencies
pip install -r requirements.txt
```
Step 2: Sample Collection and Metric Computation
Collect Samples
To collect samples from the model, use the following command. You can specify the dataset, model, and number of samples:
```bash
bash ./experiments/sample.sh -d [DATASETS] -m [MODEL] -s [NUMBER_OF_SAMPLES]
```
Compute Metrics
You can compute various utility metrics such as log probability, Wasserstein distance, and token length using the following scripts:
```bash
# Compute log probability
bash ./experiments/compute_logprob.sh -d [DATASETS] -m [MODEL] -s [NUMBER_OF_SAMPLES]

# Compute Wasserstein distance
bash ./experiments/compute_wd.sh -d [DATASETS] -m [MODEL] -s [NUMBER_OF_SAMPLES]

# Compute token length
bash ./experiments/compute_length.sh -d [DATASETS] -m [MODEL] -s [NUMBER_OF_SAMPLES]
```
Compute Reward Values
To compute reward values for specific datasets, you can use the following command. Here, specify the dataset, number of samples, and the reward type:
```bash
bash ./experiments/compute_reward.sh -d [DATASETS] -s [NUMBER_OF_SAMPLES] -i [REWARD_TYPE]
```
Step 3: Running SRBoN
Finally, to compute the SRBoN values, run the following script:
```bash
python3 stochastic_rbon/stochastic_rbon.py --dataset [DATASETS] --ncandidates [NUMBER_OF_SAMPLES]
```
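For orientation, the selection step of plain Best-of-N sampling can be sketched in Python as follows. This is only the unregularized baseline that the name SRBoN builds on, not the stochastic regularized variant implemented in `stochastic_rbon/stochastic_rbon.py`. The function name `best_of_n` and the toy reward table are hypothetical and exist only for this illustration.

```python
from typing import Callable, Sequence


def best_of_n(candidates: Sequence[str], reward_fn: Callable[[str], float]) -> str:
    """Return the candidate with the highest reward (plain Best-of-N).

    Illustrative only: the repository's SRBoN adds stochasticity and
    regularization on top of this idea, which this sketch does not model.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    # max() with a key evaluates reward_fn once per candidate and
    # returns the first candidate that attains the maximum reward.
    return max(candidates, key=reward_fn)


# Toy reward table standing in for a real reward model (made-up values).
toy_rewards = {"response A": 0.2, "response B": 0.9, "response C": 0.5}
chosen = best_of_n(list(toy_rewards), toy_rewards.__getitem__)
print(chosen)
```

In the actual pipeline, the candidates are the samples collected in Step 2 and the rewards come from the reward model passed with `-i`.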
Examples
Below is an example of running the SRBoN pipeline using the alpaca dataset, the HuggingFaceH4/mistral-7b-sft-beta model, and 100 samples.
```bash
# Collect 100 samples from the model
bash ./experiments/sample.sh -d alpaca -m HuggingFaceH4/mistral-7b-sft-beta -s 100

# Compute log probabilities for the collected samples
bash ./experiments/compute_logprob.sh -d alpaca -m HuggingFaceH4/mistral-7b-sft-beta -s 100

# Compute Wasserstein distance for the samples
bash ./experiments/compute_wd.sh -d alpaca -m HuggingFaceH4/mistral-7b-sft-beta -s 100

# Compute the token length of the samples
bash ./experiments/compute_length.sh -d alpaca -m HuggingFaceH4/mistral-7b-sft-beta -s 100
```
To compute reward values with different reward models:
```bash
# Compute reward values using the OpenAssistant reward model
bash ./experiments/compute_reward.sh -d alpaca -s 100 -i OpenAssistant/reward-model-deberta-v3-large-v2

# Compute reward values using the openbmb reward model
bash ./experiments/compute_reward.sh -d alpaca -s 100 -i openbmb/Eurus-RM-7b
```
Finally, run the SRBoN computation with 100 candidates:
```bash
python3 stochastic_rbon/stochastic_rbon.py --dataset alpaca --ncandidates 100
```
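The example stages share the same dataset, model, and sample count, so they can be assembled programmatically. The following Python sketch only builds the command lines shown above and prints them as a dry run. The helper name `build_pipeline_commands` is hypothetical and not part of the repository; the script paths and flags are copied from the commands in this README.

```python
import shlex
from typing import List, Sequence


def build_pipeline_commands(
    dataset: str,
    model: str,
    n_samples: int,
    reward_models: Sequence[str],
) -> List[List[str]]:
    """Return one argv list per pipeline stage, in execution order."""
    n = str(n_samples)
    commands: List[List[str]] = []

    # Sampling and the three metric scripts all take -d, -m and -s.
    for script in ("sample.sh", "compute_logprob.sh", "compute_wd.sh", "compute_length.sh"):
        commands.append(["bash", f"./experiments/{script}", "-d", dataset, "-m", model, "-s", n])

    # One reward pass per reward model; -i selects the reward type.
    for reward_model in reward_models:
        commands.append(
            ["bash", "./experiments/compute_reward.sh", "-d", dataset, "-s", n, "-i", reward_model]
        )

    # Final stage: the SRBoN computation over the collected candidates.
    commands.append(
        ["python3", "stochastic_rbon/stochastic_rbon.py", "--dataset", dataset, "--ncandidates", n]
    )
    return commands


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in build_pipeline_commands(
        "alpaca",
        "HuggingFaceH4/mistral-7b-sft-beta",
        100,
        ["OpenAssistant/reward-model-deberta-v3-large-v2", "openbmb/Eurus-RM-7b"],
    ):
        print(shlex.join(argv))
```

To execute the stages rather than print them, each argv list could be passed to `subprocess.run(argv, check=True)` from the repository root, so that a failing stage stops the run before later stages consume incomplete outputs.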
Reference
Ichihara, Y., Jinnai, Y., Morimura, T., Abe, K., Ariu, K., Sakamoto, M., and Uchibe, E. "Evaluation of Best-of-N Sampling Strategies for Language Model Alignment." Transactions on Machine Learning Research, 2025.
BibTeX:
```bibtex
@article{ichihara2025evaluation,
  title={Evaluation of Best-of-N Sampling Strategies for Language Model Alignment},
  author={Yuki Ichihara and Yuu Jinnai and Tetsuro Morimura and Kenshi Abe and Kaito Ariu and Mitsuki Sakamoto and Eiji Uchibe},
  journal={Transactions on Machine Learning Research},
  issn={2835-8856},
  year={2025},
  url={https://openreview.net/forum?id=H4S4ETc8c9},
  note={}
}
```
Owner
- Name: CyberAgent AI Lab
- Login: CyberAgentAILab
- Kind: organization
- Location: Japan
- Website: https://cyberagent.ai/ailab/
- Twitter: cyberagent_ai
- Repositories: 7
- Profile: https://github.com/CyberAgentAILab
GitHub Events
Total
- Watch event: 5
- Push event: 1
- Public event: 1
Last Year
- Watch event: 5
- Push event: 1
- Public event: 1
Dependencies
- POT *
- absl-py *
- accelerate *
- bert_score *
- bitsandbytes *
- datasets *
- einops *
- evaluate *
- google-cloud-storage *
- llm-blender *
- nltk ==3.8.1
- peft *
- py7zr *
- rouge-score ==0.1.2
- sacremoses ==0.0.53
- scikit-learn-extra *
- sortedcontainers *
- subword-nmt *
- torchmetrics *
- tqdm *
- transformers *
- unbabel-comet *