https://github.com/camtice/sandbagdetect

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (found codemeta.json file)
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (6.4%) to scientific vocabulary

Last synced: 10 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: camtice
Language: Jupyter Notebook
Default Branch: main
Size: 31.4 MB

Statistics

Stars: 6
Watchers: 3
Forks: 1
Open Issues: 1
Releases: 1

Created about 2 years ago · Last pushed over 1 year ago

Metadata Files

Readme

README

The requirements.txt file lists all Python libraries required to run the various models. The requirements are specified per model because some packages are incompatible with each other.

Run

Specify the settings in the 'config.py' file, then run the main script in the home directory. No other steps are needed.

Settings

token_name (str): The authentication token for Hugging Face. (Sometimes required)
model_id (str): The model id of the model you want to test.
model_name (str): The name of the model you want to test. (required for system prompt)
datafilename (str): The name of the benchmark you want to test.
result_fname (str): The name of the file where the results should be saved.
percentages (list): The percentages of weights that should be tested.
noise_means (list): The means of the noise that should be tested.
noise_stds (list): The standard deviations of the noise that should be tested.
n_trials (int): The number of trials that should be run.
seeds (list or None): The seeds that should be used for the trials. If None, random seeds are used.
n_prompts (int or None): The number of prompts between 0% and 100% sandbagging.
sample_method (str): The sampling method that should be used. Either "greedy" or "multinomial".
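
As a rough illustration of how these settings fit together, the sketch below fills in each key with a placeholder value and enumerates the runs they would imply. Every value is hypothetical, the `experiment_grid` helper is not part of the repository, and the assumptions that percentages are fractions between 0 and 1 and that the settings are combined as a full grid with one seed per trial are guesses rather than documented behavior.

```python
import random
from itertools import product

# Hypothetical settings; all values are placeholders, not repository defaults.
config = {
    "token_name": None,                 # Hugging Face token, only sometimes required
    "model_id": "org/model-id",         # placeholder model id
    "model_name": "model-name",         # placeholder name used in the system prompt
    "datafilename": "benchmark.jsonl",  # placeholder benchmark file name
    "result_fname": "results.csv",      # placeholder results file name
    "percentages": [0.0, 0.5, 1.0],     # assumed to be fractions of weights
    "noise_means": [0.0],
    "noise_stds": [0.0, 0.01],
    "n_trials": 2,
    "seeds": None,                      # None means random seeds
    "n_prompts": None,
    "sample_method": "greedy",          # "greedy" or "multinomial"
}


def experiment_grid(cfg):
    """List every (percentage, mean, std, seed) combination implied by cfg."""
    if cfg["sample_method"] not in ("greedy", "multinomial"):
        raise ValueError("sample_method must be 'greedy' or 'multinomial'")
    seeds = cfg["seeds"]
    if seeds is None:
        # Assumption: one randomly drawn seed per trial.
        seeds = [random.randrange(2**32) for _ in range(cfg["n_trials"])]
    return list(
        product(cfg["percentages"], cfg["noise_means"], cfg["noise_stds"], seeds)
    )


grid = experiment_grid(config)
```

With the placeholder values above, the grid has 3 percentages x 1 mean x 2 standard deviations x 2 seeds, so 12 runs in total.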

Advanced Settings for password locked models

These settings apply only to the password locked models. Usually, you will only need the standard settings.

PEFT (bool): True
adapter_id (str): None
bias (bool): False. Whether to force ABCD answers.
password (str): None or any string that is the password for the model.

Other Settings

exclude_layers (list): The layers that should not receive any noise.
quiet (bool): Set to True to suppress the print outputs.
verbose (int): The level of detail in the print outputs, from 1 to 4.
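
To make the roles of percentages, noise_means, noise_stds, seeds, and exclude_layers concrete, here is a minimal sketch of noise injection on a toy model. It is not the repository's implementation: plain Python lists stand in for weight tensors, layers are identified by hypothetical string names, and the `add_noise` function and the rule of perturbing a random subset of each layer are assumptions made for illustration.

```python
import random


def add_noise(weights, percentage, mean, std, seed, exclude_layers=()):
    """Return a copy of weights with Gaussian noise added to a subset.

    weights: dict mapping a layer name to a list of floats (toy stand-in).
    percentage: fraction in [0, 1] of each layer's weights to perturb.
    exclude_layers: layer names that receive no noise.
    """
    rng = random.Random(seed)  # a seeded generator makes a trial reproducible
    noisy = {}
    for name, values in weights.items():
        values = list(values)  # copy so the input is left unchanged
        if name not in exclude_layers:
            k = round(percentage * len(values))
            # Pick k distinct positions and shift each by a Gaussian sample.
            for i in rng.sample(range(len(values)), k):
                values[i] += rng.gauss(mean, std)
        noisy[name] = values
    return noisy


weights = {"layer0": [0.0] * 10, "layer1": [0.0] * 10}
noisy = add_noise(
    weights, percentage=0.5, mean=0.0, std=0.1, seed=0, exclude_layers=["layer1"]
)
changed = sum(v != 0.0 for v in noisy["layer0"])
```

In this toy run half of the entries in `layer0` are perturbed, while `layer1` is left untouched because it is listed in `exclude_layers`. Reusing the same seed reproduces the same perturbation.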

Contact

For questions, reach out to Philipp Alexander Kreer (philipp.a.kreer@outlook.de), Cameron Tice (cwtice72@gmail.com), or Prithvi Shahani (victeldo@gmail.com).

Owner

Name: Cameron Tice
Login: camtice
Kind: user

Repositories: 1
Profile: https://github.com/camtice

Biomedical Sciences, Psychology, and Statistics student at Auburn University

GitHub Events

Total

Watch event: 1
Delete event: 18
Push event: 119
Pull request event: 4
Fork event: 1
Create event: 10

Last Year

Watch event: 1
Delete event: 18
Push event: 119
Pull request event: 4
Fork event: 1
Create event: 10
