StereoSet: Measuring stereotypical bias in pretrained language models

This repository contains an extensible codebase to measure stereotypical bias on new pretrained models, as well as code to replicate our results. We encourage the community to use this as a springboard for further evaluation of bias in pretrained language models, and to submit attempts to mitigate bias to the leaderboard.

Note: This repository is currently not actively maintained. For updated code and the full test set, see the Bias Bench repository.

Installation

1. Clone the repository: git clone https://github.com/moinnadeem/stereoset.git
2. Install the requirements: cd stereoset && pip install -r requirements.txt

Reproducing Results

To reproduce our bias results for each model:

1. Run make from the code folder. This step evaluates the biases of each model.
2. Run the scoring script over each model's predictions: python3 evaluation.py --gold-file ../data/dev.json --predictions-dir predictions/
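
The two steps above can also be driven from a small Python wrapper. This is a minimal sketch, not part of the repository: the function names reproduction_commands and reproduce are hypothetical, and it assumes evaluation.py sits in the code folder (inferred from the relative ../data/dev.json path) and that make and python3 are on the PATH.

```python
import subprocess
from pathlib import Path


def reproduction_commands(gold_file="../data/dev.json", predictions_dir="predictions/"):
    """Return the two commands from the steps above, in order.

    The defaults are copied from the README; both paths are relative to
    the code folder.
    """
    return [
        ["make"],
        [
            "python3", "evaluation.py",
            "--gold-file", gold_file,
            "--predictions-dir", predictions_dir,
        ],
    ]


def reproduce(code_dir):
    """Run each command from inside code_dir, stopping at the first failure."""
    code_dir = Path(code_dir)
    if not code_dir.is_dir():
        raise FileNotFoundError(f"code folder not found: {code_dir}")
    for command in reproduction_commands():
        print("running:", " ".join(command))
        # check=True raises CalledProcessError if a step exits non-zero,
        # so the scoring script never runs after a failed make.
        subprocess.run(command, cwd=code_dir, check=True)
```

From the repository root, calling reproduce("code") would run both steps in sequence. Keeping the command list in its own function makes it easy to print or inspect the commands without executing them.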

We have provided our predictions in the predictions/ folder, and the output of the evaluation script in predictions.txt. We have also included code to replicate the numbers in each table in the tables/ folder. Please feel free to file an issue if anything is off; we strongly believe in reproducible research and extensible codebases.

Citation

To cite StereoSet:

@misc{nadeem2020stereoset,
    title={StereoSet: Measuring stereotypical bias in pretrained language models},
    author={Moin Nadeem and Anna Bethke and Siva Reddy},
    year={2020},
    eprint={2004.09456},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
