
The Effects of Ensembling on Long-Tailed Data

Code for the paper "The Effects of Ensembling on Long-Tailed Data", a systematic comparison of logit and probability ensembling across a variety of models trained on balanced and imbalanced datasets (a minimal sketch of both ensembling strategies follows the findings list below).

Findings:

  • Adding more ensemble members continues to improve performance on imbalanced datasets.
  • On balanced datasets, logit and probability ensembles perform equivalently across a variety of models.
  • On imbalanced datasets, logit and probability ensembles differ, depending on the ensemble's diversity and dependency.
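
For intuition, here is a minimal NumPy sketch of the two ensembling strategies being compared. This is illustrative only; the function names are ours, not the repository's API:

```python
import numpy as np

def logit_ensemble(logits):
    # logits: array of shape (M, N, C) -- M members, N examples, C classes.
    # Average the raw logits across members, then apply softmax once.
    avg = logits.mean(axis=0)                           # (N, C)
    e = np.exp(avg - avg.max(axis=-1, keepdims=True))   # stable softmax
    return e / e.sum(axis=-1, keepdims=True)

def probability_ensemble(logits):
    # Apply softmax to each member's logits, then average the probabilities.
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs = e / e.sum(axis=-1, keepdims=True)           # (M, N, C)
    return probs.mean(axis=0)
```

The two coincide when all members agree, but can rank classes differently when members disagree, which is where the imbalanced-data effects studied in the paper show up.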

Fig. 1: Comparison between logit and probability ensembles of models trained on CIFAR10-LT.

Table 3: Ensembles outperform common approaches for handling long-tailed data.

@inproceedings{buchanan2023the,
  title={The Effects of Ensembling on Long-Tailed Data},
  author={E. Kelly Buchanan and Geoff Pleiss and Yixin Wang and John Patrick Cunningham},
  booktitle={NeurIPS 2023 Workshop Heavy Tails in Machine Learning},
  year={2023}
}

Installation instructions are in docs/README.md.

Experiments:

  1. Train a resnet32 model on the CIFAR10 dataset:
python scripts/run.py --config-name="run_gpu_cifar10"
  2. Train models on the CIFAR10-LT dataset across multiple losses:
wandb sweep experiments/compare_loss/train_gpu_loss_cifar10.yaml
  3. Train additional models on CIFAR10-LT:
wandb sweep experiments/compare_loss/train_gpu_loss_cifar10_largeM.yaml

Paper Experiments

Wandb sweep | Experiment parameters | Comments
nggmmw4m, 0itowy8a, d4s9wp4v | train resnet32 and resnet110 models on CIFAR10-LT using multiple losses and different seeds (IMBALANCECIFAR10) | models trained with the balanced softmax loss perform best
9hwaytks, gv4bucon | train resnet32_cfa and resnet_110 on CIFAR100-LT using multiple losses and different seeds (IMBALANCECIFAR100Aug) | models trained with the balanced softmax loss perform best

Reproduce paper tables and figures:

  • Fig: Ensemble size vs ensemble type across multiple losses:
python scripts/vis_scripts/plot_results_metric_M.py --config-path="../../results/configs/comparison_baseline_cifar10lt" --config-name="compare_M"
  • Table: Ensemble performance of models trained on CIFAR10-LT and CIFAR100-LT:
python scripts/compare_all_results.py --config-path="../results/configs/comparison_baseline_cifar10lt" --config-name="default"
python scripts/compare_all_results.py --config-path="../results/configs/comparison_baseline_cifar100lt" --config-name="default"
  • Fig: Class ID vs avg. disagreement (see the sketch after this list):
python scripts/vis_scripts/plot_results_pclass.py
  • Fig: Class ID vs diversity/dependency:
python scripts/vis_scripts/plot_results_dkl_diff.py
  • Fig: Performance of logit and probability ensembles on balanced datasets:
python scripts/vis_scripts/plot_single_metric_xy.py --datasets=base --metric=error
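
For intuition about the per-class disagreement statistic in the figure above, here is a minimal sketch. The helper `per_class_disagreement` is hypothetical; the repository's own computation lives in the plotting scripts and may differ:

```python
import numpy as np
from itertools import combinations

def per_class_disagreement(preds, labels, num_classes):
    # preds:  (M, N) integer class predictions from M ensemble members.
    # labels: (N,) ground-truth class IDs.
    # Returns, for each true class, the fraction of examples on which two
    # members disagree, averaged over all member pairs.
    M = preds.shape[0]
    rates = np.zeros(num_classes)
    for c in range(num_classes):
        idx = labels == c
        pair_rates = [(preds[i, idx] != preds[j, idx]).mean()
                      for i, j in combinations(range(M), 2)]
        rates[c] = np.mean(pair_rates)
    return rates
```

On long-tailed data, plotting these rates against class ID ordered by class frequency makes the diversity pattern across head and tail classes visible.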

