https://github.com/cyberagentailab/srbon

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (found)
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (9.6%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: CyberAgentAILab
License: gpl-3.0
Language: Python
Default Branch: main
Size: 31.3 KB

Statistics

Stars: 2
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created over 1 year ago · Last pushed over 1 year ago

Metadata Files

Readme License

Stochastic Regularized Best-of-N Sampling (SRBoN)

This repository contains the implementation of Stochastic Regularized Best-of-N Sampling (SRBoN).

The code is tested using Python 3.8 and CUDA 11.0.

Setup Instructions

Step 1: Environment Setup

Create a virtual environment, then install the required dependencies:

```bash
# Create virtual environment
python3 -m venv env
source env/bin/activate

# Install dependencies
pip install -r requirements.txt
```
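Because the code is tested with Python 3.8, it can help to confirm which interpreter the activated environment provides before collecting samples. The following is a minimal sketch, not part of the repository: the function name `version_message` and the wording of its messages are assumptions made for illustration, and it does not check the CUDA version.

```bash
# Hypothetical helper (not part of the repository): compare an interpreter
# version string against the version the README says the code is tested with.
TESTED_PYTHON="3.8"

version_message() {
  # $1 is a "major.minor" version string such as 3.8
  local found="$1"
  if [ "$found" = "$TESTED_PYTHON" ]; then
    echo "Python $found matches the tested version"
  else
    echo "Warning: found Python $found, but the code is tested with Python $TESTED_PYTHON"
  fi
}

# Report on the python3 currently on PATH, if there is one.
if command -v python3 >/dev/null 2>&1; then
  version_message "$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
else
  echo "python3 not found on PATH"
fi
```

A warning here does not mean the code will fail; it only flags that the interpreter differs from the one the authors report testing with.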

Step 2: Sample Collection and Metric Computation

Collect Samples

To collect samples from the model, use the following command. You can specify the dataset, model, and number of samples:

```bash
bash ./experiments/sample.sh -d [DATASETS] -m [MODEL] -s [NUMBER_OF_SAMPLES]
```

Compute Metrics

You can compute various utility metrics such as log probability, Wasserstein distance, and token length using the following scripts:

```bash
# Compute log probability
bash ./experiments/compute_logprob.sh -d [DATASETS] -m [MODEL] -s [NUMBER_OF_SAMPLES]

# Compute Wasserstein distance
bash ./experiments/compute_wd.sh -d [DATASETS] -m [MODEL] -s [NUMBER_OF_SAMPLES]

# Compute token length
bash ./experiments/compute_length.sh -d [DATASETS] -m [MODEL] -s [NUMBER_OF_SAMPLES]
```

Compute Reward Values

To compute reward values for specific datasets, you can use the following command. Here, specify the dataset, number of samples, and the reward type:

```bash
bash ./experiments/compute_reward.sh -d [DATASETS] -s [NUMBER_OF_SAMPLES] -i [REWARD_TYPE]
```

Step 3: Running SRBoN

Finally, to compute the SRBoN values, run the following script:

```bash
python3 stochastic_rbon/stochastic_rbon.py --dataset [DATASETS] --ncandidates [NUMBER_OF_SAMPLES]
```

Examples

Below is an example of running the SRBoN pipeline using the alpaca dataset, the HuggingFaceH4/mistral-7b-sft-beta model, and 100 samples.

```bash
# Collect 100 samples from the model
bash ./experiments/sample.sh -d alpaca -m HuggingFaceH4/mistral-7b-sft-beta -s 100

# Compute log probabilities for the collected samples
bash ./experiments/compute_logprob.sh -d alpaca -m HuggingFaceH4/mistral-7b-sft-beta -s 100

# Compute Wasserstein distance for the samples
bash ./experiments/compute_wd.sh -d alpaca -m HuggingFaceH4/mistral-7b-sft-beta -s 100

# Compute the token length of the samples
bash ./experiments/compute_length.sh -d alpaca -m HuggingFaceH4/mistral-7b-sft-beta -s 100
```

To compute reward values using different reward models:

```bash
# Compute reward values using the OpenAssistant reward model
bash ./experiments/compute_reward.sh -d alpaca -s 100 -i OpenAssistant/reward-model-deberta-v3-large-v2

# Compute reward values using the openbmb reward model
bash ./experiments/compute_reward.sh -d alpaca -s 100 -i openbmb/Eurus-RM-7b
```

Finally, run the SRBoN computation with 100 candidates:

```bash
python3 stochastic_rbon/stochastic_rbon.py --dataset alpaca --ncandidates 100
```
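The individual commands in this example can also be chained in a single wrapper so that the dataset, model, and sample count are stated once. The following is a sketch under stated assumptions rather than a script shipped with the repository: the names `run`, `run_pipeline`, and `DRY_RUN` are hypothetical, and the step order simply follows this README (samples, metrics, rewards, then SRBoN). By default it only prints the commands it would run.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper (not part of the repository): chains the README commands.
set -euo pipefail

# With DRY_RUN=1 (the default in this sketch) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage: run_pipeline DATASET MODEL NUM_SAMPLES [REWARD_MODEL...]
run_pipeline() {
  local dataset="$1" model="$2" n="$3"
  shift 3

  # Step 2: collect samples, then compute the utility metrics.
  run bash ./experiments/sample.sh -d "$dataset" -m "$model" -s "$n"
  run bash ./experiments/compute_logprob.sh -d "$dataset" -m "$model" -s "$n"
  run bash ./experiments/compute_wd.sh -d "$dataset" -m "$model" -s "$n"
  run bash ./experiments/compute_length.sh -d "$dataset" -m "$model" -s "$n"

  # Step 2 (continued): one reward computation per reward model given.
  local reward_model
  for reward_model in "$@"; do
    run bash ./experiments/compute_reward.sh -d "$dataset" -s "$n" -i "$reward_model"
  done

  # Step 3: run SRBoN on the collected candidates.
  run python3 stochastic_rbon/stochastic_rbon.py --dataset "$dataset" --ncandidates "$n"
}

run_pipeline alpaca HuggingFaceH4/mistral-7b-sft-beta 100 \
  OpenAssistant/reward-model-deberta-v3-large-v2 openbmb/Eurus-RM-7b
```

Setting `DRY_RUN=0` executes the commands from the repository root instead of printing them. Because of `set -e`, the wrapper stops at the first failing step.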

Reference

Ichihara, Y., Jinnai, Y., Morimura, T., Abe, K., Ariu, K., Sakamoto, M., and Uchibe, E. "Evaluation of Best-of-N Sampling Strategies for Language Model Alignment." Transactions on Machine Learning Research, 2025.

Bibtex:

```bibtex
@article{ichihara2025evaluation,
  title={Evaluation of Best-of-N Sampling Strategies for Language Model Alignment},
  author={Yuki Ichihara and Yuu Jinnai and Tetsuro Morimura and Kenshi Abe and Kaito Ariu and Mitsuki Sakamoto and Eiji Uchibe},
  journal={Transactions on Machine Learning Research},
  issn={2835-8856},
  year={2025},
  url={https://openreview.net/forum?id=H4S4ETc8c9},
  note={}
}
```

Owner

Name: CyberAgent AI Lab
Login: CyberAgentAILab
Kind: organization
Location: Japan

Website: https://cyberagent.ai/ailab/
Twitter: cyberagent_ai
Repositories: 7
Profile: https://github.com/CyberAgentAILab

GitHub Events

Total

Watch event: 5
Push event: 1
Public event: 1

Last Year

Watch event: 5
Push event: 1
Public event: 1

Dependencies

requirements.txt pypi

POT *
absl-py *
accelerate *
bert_score *
bitsandbytes *
datasets *
einops *
evaluate *
google-cloud-storage *
llm-blender *
nltk ==3.8.1
peft *
py7zr *
rouge-score ==0.1.2
sacremoses ==0.0.53
scikit-learn-extra *
sortedcontainers *
subword-nmt *
torchmetrics *
tqdm *
transformers *
unbabel-comet *
