bias_bench

https://github.com/shomas383/bias_bench

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 2 DOI reference(s) in README
✓ Academic publication links: links to arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (9.4%) to scientific vocabulary

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: shomas383
Language: Python
Default Branch: main
Size: 6.3 MB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created 11 months ago · Last pushed 11 months ago

Metadata Files

Readme Citation

An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models

Nicholas Meade, Elinor Poole-Dayan, Siva Reddy

This repository contains the official source code for An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models presented at ACL 2022.

Bias Bench Leaderboard

For tracking progress on the intrinsic bias benchmarks evaluated in this work, we created Bias Bench. We plan to update Bias Bench in the future with additional bias benchmarks. To make a submission to Bias Bench, please contact nicholas.meade@mila.quebec.

Install

    git clone https://github.com/mcgill-nlp/bias-bench.git
    cd bias-bench
    python -m pip install -e .

Required Datasets

The external datasets required by this repository are listed below:

Dataset | Download Link | Notes | Download Directory
--------|---------------|-------|-------------------
Wikipedia-2.5 | Download | English Wikipedia dump used for SentenceDebias and INLP. | data/text
Wikipedia-10 | Download | English Wikipedia dump used for CDA and Dropout. | data/text

Each dataset should be downloaded to the specified path, relative to the root directory of the project.
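A quick pre-flight check can catch a missing download before a long job starts. The following is a minimal sketch, assuming only the `data/text` download directory from the table; the helper name `missing_datasets` is hypothetical and not part of this repository. Because both dumps share one directory and their file names are not given here, the check can only tell whether the directory exists and is non-empty.

```python
from pathlib import Path

# Download directories from the table above, relative to the project root.
REQUIRED_DIRS = {
    "Wikipedia-2.5": "data/text",
    "Wikipedia-10": "data/text",
}


def missing_datasets(project_root, required=REQUIRED_DIRS):
    """Return dataset names whose download directory is absent or empty.

    Hypothetical helper: it does not verify individual files, because the
    expected file names are not listed in this README.
    """
    root = Path(project_root)
    missing = []
    for name, rel_dir in required.items():
        target = root / rel_dir
        if not target.is_dir() or not any(target.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Run from the root directory of the project.
    for name in missing_datasets("."):
        print(f"missing dataset: {name}")
```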

Experiments

We provide scripts for running all of the experiments presented in the paper. Generally, each script takes a --model argument and a --model_name_or_path argument. We briefly describe the script(s) for each experiment below:

CrowS-Pairs: Two scripts are provided for evaluating models against CrowS-Pairs: experiments/crows.py evaluates non-debiased models against CrowS-Pairs and experiments/crows_debias.py evaluates debiased models against CrowS-Pairs.
INLP Projection Matrix: experiments/inlp_projection_matrix.py is used to compute INLP projection matrices.
SEAT: Two scripts are provided for evaluating models against SEAT: experiments/seat.py evaluates non-debiased models against SEAT and experiments/seat_debias.py evaluates debiased models against SEAT.
StereoSet: Two scripts are provided for evaluating models against StereoSet: experiments/stereoset.py evaluates non-debiased models against StereoSet and experiments/stereoset_debias.py evaluates debiased models against StereoSet.
SentenceDebias Subspace: experiments/sentence_debias_subspace.py is used to compute SentenceDebias subspaces.
GLUE: experiments/run_glue.py is used to run the GLUE benchmark.
Perplexity: experiments/perplexity.py is used to compute perplexities on WikiText-2.

For a complete list of options for each experiment, run the experiment script with the -h (or --help) option. For example usages of these experiment scripts, refer to batch_jobs. The commands used in batch_jobs produce the results presented in the paper.
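As a rough illustration of the shared interface described above, the sketch below assembles a command line for one of the benchmark scripts. Only the script paths and the `--model` and `--model_name_or_path` flags come from this README. The helper name `build_command` and the example values `BertForMaskedLM` and `bert-base-uncased` are assumptions chosen for illustration; check batch_jobs for the values actually used.

```python
import sys

# Script paths as listed in this README; the keys are illustrative labels.
EXPERIMENT_SCRIPTS = {
    "crows": "experiments/crows.py",
    "crows_debias": "experiments/crows_debias.py",
    "seat": "experiments/seat.py",
    "seat_debias": "experiments/seat_debias.py",
    "stereoset": "experiments/stereoset.py",
    "stereoset_debias": "experiments/stereoset_debias.py",
}


def build_command(experiment, model, model_name_or_path):
    """Return an argv list for one experiment script (hypothetical helper)."""
    script = EXPERIMENT_SCRIPTS[experiment]
    return [
        sys.executable, script,
        "--model", model,
        "--model_name_or_path", model_name_or_path,
    ]


# Example values are assumptions, not taken from this README.
cmd = build_command("crows", "BertForMaskedLM", "bert-base-uncased")
print(" ".join(cmd[1:]))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the project root. Some scripts may need further options, which their help output lists.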

Notes

To run SentenceDebias models against any of the benchmarks, you will first need to run experiments/sentence_debias_subspace.py.
To run INLP models against any of the benchmarks, you will first need to run experiments/inlp_projection_matrix.py.
export contains a collection of scripts to format the results into the tables presented in the paper.
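The ordering constraints in these notes can be written down as a small lookup. This is only a sketch: the two prerequisite script paths come from the notes above, while the function name `plan` and the method labels used as dictionary keys are assumptions for illustration.

```python
# Prerequisite scripts named in the notes above. Methods without an entry
# (for example CDA or Dropout) have no prerequisite listed in this README.
PREREQUISITES = {
    "SentenceDebias": "experiments/sentence_debias_subspace.py",
    "INLP": "experiments/inlp_projection_matrix.py",
}


def plan(method, benchmark_script):
    """Return the scripts to run, in order, for one method and benchmark."""
    steps = []
    prerequisite = PREREQUISITES.get(method)
    if prerequisite is not None:
        steps.append(prerequisite)
    steps.append(benchmark_script)
    return steps


print(plan("INLP", "experiments/seat_debias.py"))
# ['experiments/inlp_projection_matrix.py', 'experiments/seat_debias.py']
```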

(Own Note)

Run python stereoset_alt.py --model_name_or_path name_of_huggingface_model to create the results. They should be stored in result/_name_of_model. After obtaining the results, run python stereoset_evaluation.py --persistent_dir ../ --predictions_file /_location_of_results
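The two steps in this note can be chained so that evaluation only runs if prediction succeeds. The sketch below is an assumption-laden illustration: the script names and flags are copied from the note, but the function names are hypothetical, and the predictions path is left as a parameter because its exact location is not spelled out here.

```python
import subprocess
import sys


def stereoset_commands(model_name_or_path, predictions_file, persistent_dir="../"):
    """Build the two argv lists from the note above (hypothetical helper)."""
    predict = [
        sys.executable, "stereoset_alt.py",
        "--model_name_or_path", model_name_or_path,
    ]
    evaluate = [
        sys.executable, "stereoset_evaluation.py",
        "--persistent_dir", persistent_dir,
        "--predictions_file", predictions_file,
    ]
    return predict, evaluate


def run_pipeline(model_name_or_path, predictions_file):
    """Run prediction, then evaluation; check=True stops on the first failure."""
    for cmd in stereoset_commands(model_name_or_path, predictions_file):
        subprocess.run(cmd, check=True)
```

`run_pipeline` is meant to be called from the directory that contains both scripts, with the predictions path pointing at whatever file the first step wrote.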

Running on an HPC Cluster

We provide scripts for running all of the experiments presented in the paper on a SLURM cluster in batch_jobs. If you plan to use these scripts, make sure you customize python_job.sh to run the jobs on your cluster. You will also need to change both the output (-o) and error (-e) paths.

Acknowledgements

This repository makes use of code from the following repositories:

We thank the authors for making their code publicly available.

Citation

If you use the code in this repository, please cite the following paper:

@inproceedings{meade_2022_empirical,
    title = "An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models",
    author = "Meade, Nicholas  and Poole-Dayan, Elinor  and Reddy, Siva",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.132",
    doi = "10.18653/v1/2022.acl-long.132",
    pages = "1878--1898",
}

Owner

Login: shomas383
Kind: user

Repositories: 2
Profile: https://github.com/shomas383

Citation (CITATION.bib)

@inproceedings{meade_empirical_2022,
    title = "An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models",
    author = "Meade, Nicholas  and
      Poole-Dayan, Elinor  and
      Reddy, Siva",
    editor = "Muresan, Smaranda  and
      Nakov, Preslav  and
      Villavicencio, Aline",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.132",
    doi = "10.18653/v1/2022.acl-long.132",
    pages = "1878--1898",
}

GitHub Events

Total

Push event: 1
Create event: 1

Last Year

Push event: 1
Create event: 1

Dependencies

setup.py pypi

accelerate ==0.5.1
datasets ==1.18.3
nltk ==3.7.0
scikit-learn ==1.0.2
scipy ==1.7.3
torch ==1.10.2
transformers ==4.16.2
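Since every dependency is pinned to an exact version, it can be useful to compare the active environment against these pins before running anything. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper names `release_tuple` and `check_pins` are hypothetical. It compares only the numeric release segment, so an installed `3.7` matches the pin `3.7.0` and a local build suffix such as `+cu113` is ignored. Pre-release tags are not handled.

```python
from importlib import metadata

# Pins copied from the setup.py dependency list above.
PINS = {
    "accelerate": "0.5.1",
    "datasets": "1.18.3",
    "nltk": "3.7.0",
    "scikit-learn": "1.0.2",
    "scipy": "1.7.3",
    "torch": "1.10.2",
    "transformers": "4.16.2",
}


def release_tuple(version):
    """Parse the leading numeric release, dropping trailing zeros.

    Simplified on purpose: local suffixes ("+cu113") are cut off and
    parsing stops at the first non-numeric component.
    """
    public = version.split("+", 1)[0]
    parts = []
    for piece in public.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    while parts and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def check_pins(pins, get_version=metadata.version):
    """Map each package name to "ok", "missing", or a mismatch message."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if release_tuple(installed) == release_tuple(wanted):
            report[name] = "ok"
        else:
            report[name] = f"installed {installed}, pinned {wanted}"
    return report


if __name__ == "__main__":
    for name, status in check_pins(PINS).items():
        print(f"{name}: {status}")
```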
