The Spotlight: A General Method for Discovering Systematic Errors in Deep Learning Models

This repository is the official implementation of "The Spotlight: A General Method for Discovering Systematic Errors in Deep Learning Models" (under submission at NeurIPS 2021). It includes:

Training code for FairFace and X-ray classifiers
Code for running the spotlight (inference passes and spotlight optimizer)
Analysis notebooks used to visualize the results in the paper

Requirements

Packages

To install requirements for training and running spotlights:

pip install -r requirements.txt

For analysis notebooks, we used Singularity to run the scipy-notebook Jupyter Docker stack.

Datasets

Our experiments use the following datasets. Set the environment variable DATA_DIR to the directory that contains them, with one subdirectory per dataset:

$DATA_DIR/fairface: FairFace, using the padding=0.25 version of the dataset
$DATA_DIR/imagenet: ImageNet
$DATA_DIR/amazon: Amazon Polarity
$DATA_DIR/squad: SQuAD
$DATA_DIR/movielens: MovieLens 100k, from Graham, Hartford et al.'s implementation of DeepSet
$DATA_DIR/xray: X-ray
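Before launching training or inference, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_datasets` is hypothetical, and the subdirectory names are copied from the list above.

```python
from pathlib import Path

# Subdirectory names copied from the dataset list above.
DATASET_SUBDIRS = ["fairface", "imagenet", "amazon", "squad", "movielens", "xray"]


def missing_datasets(data_dir):
    """Return the expected dataset subdirectories that are absent under data_dir.

    Hypothetical helper for illustration; the repository's scripts may
    validate (or not validate) their inputs differently.
    """
    root = Path(data_dir)
    return [name for name in DATASET_SUBDIRS if not (root / name).is_dir()]
```

Calling `missing_datasets(os.environ["DATA_DIR"])` should return an empty list when every dataset directory is present; any names it returns are directories that still need to be created or downloaded.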

Training Scripts

For two of the domains in the paper, we train classifiers using standard architectures and training methods. These scripts assume that DATA_DIR and MODEL_DIR have been set appropriately:

FairFace:

python train_fairface.py --checkpoint_dir $MODEL_DIR/fairface

X-ray:

python train_xray.py

Inference

We include an inference script for each model. Each script saves the model's final-layer embeddings along with its outputs and losses:

inference_fairface.py (FairFace)
inference_imagenet.py (ImageNet)
inference_amazon.py (Amazon Polarity)
inference_squad.py (SQuAD)
inference_movielens.py (MovieLens)
inference_xray.py (X-ray)
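To make the three saved quantities concrete, here is a minimal sketch of a per-example record for a classifier. It is an illustration under stated assumptions rather than the format the scripts actually write: the `InferenceRecord` class and `make_record` helper are hypothetical, and softmax cross-entropy is assumed as the loss, which fits the classification models but not necessarily SQuAD or MovieLens.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class InferenceRecord:
    """Hypothetical per-example record; the real scripts may store these differently."""
    embedding: List[float]  # final-layer representation of the input
    output: List[float]     # model output (class probabilities in this sketch)
    loss: float             # per-example loss


def softmax(logits):
    """Convert logits to probabilities, subtracting the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, computed via log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def make_record(embedding, logits, label):
    """Bundle the embedding, output probabilities, and loss for one example."""
    return InferenceRecord(list(embedding), softmax(logits), cross_entropy(logits, label))
```

Keeping the loss per example, rather than averaged over a batch, is what lets a later step weight individual examples.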

Spotlights

The spotlight is implemented as a command-line utility in spotlight/run_spotlight.py. The specific commands that we ran in our experiments are listed in:

spotlights_fairface.sh (FairFace)
spotlights_imagenet.sh (ImageNet)
spotlights_amazon.sh (Amazon Polarity)
spotlights_squad.sh (SQuAD)
spotlights_movielens.sh (MovieLens)
spotlights_xray.sh (X-ray)

Analysis

The results shown in our paper are produced by analyzing the examples in each dataset that the spotlights assign high weights. We include our spotlight weights in spotlight_outputs/, along with Jupyter notebooks that visualize these results: analysis.ipynb (image and recommender-system models) and analysis_nlp.ipynb (NLP models).
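The selection step described above, picking the examples that a spotlight weights most heavily, can be sketched as follows. This is an illustrative helper, not the notebooks' code: the name `top_weighted` is hypothetical, and it assumes the weights for one spotlight have already been loaded into a plain sequence with one entry per example.

```python
import heapq


def top_weighted(weights, k):
    """Return the indices of the k largest weights, highest weight first.

    Hypothetical helper for illustration; `weights` is assumed to hold one
    spotlight weight per dataset example.
    """
    return heapq.nlargest(k, range(len(weights)), key=lambda i: weights[i])
```

The returned indices can then be used to look up the corresponding images, reviews, or questions for inspection.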
