Learning to Generate Noise for Multi-Attack Robustness

This is the PyTorch implementation of the paper Learning to Generate Noise for Multi-Attack Robustness (ICML 2021).

Authors: Divyam Madaan, Jinwoo Shin, Sung Ju Hwang

Abstract

Adversarial learning has emerged as one of the successful techniques to circumvent the susceptibility of existing methods against adversarial perturbations. However, the majority of existing defense methods are tailored to defend against a single category of adversarial perturbation (e.g. $\ell_\infty$-attack). In safety-critical applications, this makes these methods extraneous as the attacker can adopt diverse adversaries to deceive the system. Moreover, training on multiple perturbations simultaneously significantly increases the computational overhead during training. To address these challenges, we propose a novel meta-learning framework that explicitly learns to generate noise to improve the model's robustness against multiple types of attacks. Its key component is Meta Noise Generator (MNG) that outputs optimal noise to stochastically perturb a given sample, such that it helps lower the error on diverse adversarial perturbations. By utilizing samples generated by MNG, we train a model by enforcing the label consistency across multiple perturbations. We validate the robustness of models trained by our scheme on various datasets and against a wide variety of perturbations, demonstrating that it significantly outperforms the baselines across multiple perturbations with a marginal computational cost.
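
At a glance, MNG is trained with a meta-learning (bilevel) objective: the generator perturbs a sample, the classifier takes a virtual update on the perturbed sample, and the generator is then updated so that the virtually updated classifier performs well under adversarial perturbation. The snippet below is only a minimal PyTorch sketch of that idea under simplifying assumptions (a single attack, one plain-SGD virtual step, a hypothetical pgd_attack helper, and the classifier's own update omitted); the actual procedure is implemented in train_MNG.py and described in the paper.

import torch
import torch.nn.functional as F
from torch.func import functional_call  # requires a recent PyTorch

def mng_meta_step(model, generator, x, y, gen_opt, inner_lr=0.1):
    # 1. The generator proposes input-dependent noise (the output scale and
    #    clipping used here are assumptions of this sketch).
    x_noisy = torch.clamp(x + generator(x), 0.0, 1.0)

    # 2. Differentiable one-step virtual update of the classifier on the
    #    generator-perturbed batch ("fast weights").
    params = dict(model.named_parameters())
    inner_loss = F.cross_entropy(functional_call(model, params, (x_noisy,)), y)
    grads = torch.autograd.grad(inner_loss, tuple(params.values()), create_graph=True)
    fast = {name: p - inner_lr * g for (name, p), g in zip(params.items(), grads)}

    # 3. Outer objective: the virtually updated classifier should classify an
    #    adversarial example correctly (pgd_attack is an assumed helper).
    x_adv = pgd_attack(model, x, y)
    outer_loss = F.cross_entropy(functional_call(model, fast, (x_adv,)), y)

    # 4. The meta-gradient flows to the generator through the fast weights.
    gen_opt.zero_grad()
    outer_loss.backward()
    gen_opt.step()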

Contribution of this work

  • We introduce an Adversarial Consistency (AC) loss that enforces label consistency across multiple perturbations, encouraging smooth and robust networks (a rough sketch of such a consistency term follows this list).
  • We formulate a Meta-Noise Generator (MNG) that explicitly meta-learns an input-dependent noise generator, such that it outputs a stochastic noise distribution that improves the model's robustness and adversarial consistency across multiple types of adversarial perturbations.
  • We validate our proposed method on various datasets against diverse benchmark adversarial attacks, on which it achieves state-of-the-art performance, highlighting its practical impact.
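
For the first item, the snippet below sketches what such a label-consistency term can look like: a Jensen-Shannon-style divergence between the classifier's predictive distributions on differently perturbed views of the same batch. It is a minimal illustration only; the exact AC loss, the set of perturbations it is applied to, and its weighting are defined in the paper and in train_MNG.py.

import torch
import torch.nn.functional as F

def adversarial_consistency(logits_list, eps=1e-8):
    # Predictive distributions for each perturbed view of the same inputs.
    probs = [F.softmax(logits, dim=1) for logits in logits_list]
    mixture = torch.stack(probs).mean(dim=0).clamp_min(eps)
    # Average KL(p_i || mixture) over the views (Jensen-Shannon style).
    return sum(F.kl_div(mixture.log(), p, reduction='batchmean') for p in probs) / len(probs)

# Hypothetical usage with predictions on l_inf-, l_2-, and l_1-perturbed views,
# where lambda_ac is an assumed weighting hyperparameter:
# loss = F.cross_entropy(logits_clean, y) \
#        + lambda_ac * adversarial_consistency([logits_linf, logits_l2, logits_l1])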

Prerequisites

$ pip install -r requirements.txt

For RST (robust self-training), the unlabeled data can be obtained from here

Run

  1. CIFAR-10 experiment

# Meta Noise Generator with Adversarial Consistency and RST
$ python train_MNG.py --fname MNG_cifar10 --dataset cifar10 --model WideResNet --rst=True

# Meta Noise Generator with Adversarial Consistency
$ python train_MNG.py --fname MNG_cifar10 --dataset cifar10 --model WideResNet 

# Stochastic Adversarial Training
$ python train_pgd.py --fname MNG_cifar10 --dataset cifar10 --model WideResNet --attack_type random

# Evaluation
## PGD attacks
$ python evaluate.py --fname MNG_cifar10 --dataset cifar10 --model WideResNet --attack_lib custom --norm linf

## Foolbox attacks
$ python evaluate.py --fname MNG_cifar10 --dataset cifar10 --model WideResNet --attack_lib foolbox

## Autoattack attacks
$ python evaluate.py --fname MNG_cifar10 --dataset cifar10 --model WideResNet --attack_lib autoattack --norm linf

  2. SVHN experiment

# Meta Noise Generator with Adversarial Consistency and RST
$ python train_MNG.py --fname MNG_svhn --dataset svhn --model WideResNet --rst=True

# Meta Noise Generator with Adversarial Consistency
$ python train_MNG.py --fname MNG_svhn --dataset svhn --model WideResNet 

# Stochastic Adversarial Training
$ python train_pgd.py --fname MNG_svhn --dataset svhn --model WideResNet --attack_type random

# Evaluation
## PGD attacks
$ python evaluate.py --fname MNG_svhn --dataset svhn --model WideResNet --attack_lib custom --norm linf

## Foolbox attacks
$ python evaluate.py --fname MNG_svhn --dataset svhn --model WideResNet --attack_lib foolbox

## Autoattack attacks
$ python evaluate.py --fname MNG_svhn --dataset svhn --model WideResNet --attack_lib autoattack --norm linf

  3. Tiny-ImageNet experiment

# Meta Noise Generator with Adversarial Consistency
$ python train_MNG.py --fname MNG_tinyimagenet --dataset tinyimagenet --model resnet50


# Stochastic Adversarial Training
$ python train_pgd.py --fname MNG_tinyimagenet --dataset tinyimagenet --model resnet50 --attack_type random

# Evaluation
## PGD attacks
$ python evaluate.py --fname MNG_tinyimagenet --dataset tinyimagenet --model resnet50 --attack_lib custom --norm linf

## Foolbox attacks
$ python evaluate.py --fname MNG_tinyimagenet --dataset tinyimagenet --model resnet50 --attack_lib foolbox

## Autoattack attacks
$ python evaluate.py --fname MNG_tinyimagenet --dataset tinyimagenet --model resnet50 --attack_lib autoattack --norm linf

Pretrained models

Dataset         Architecture        Average   Max    MSD    MNG-AC   MNG-AC + RST
CIFAR-10        WideResNet 28-10    ckpt      ckpt   ckpt   ckpt     ckpt
SVHN            WideResNet 28-10    ckpt      ckpt   ckpt   ckpt     ckpt
Tiny-ImageNet   ResNet50            ckpt      ckpt   ckpt   ckpt     -
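
If you want a quick starting point for one of the downloaded checkpoints, a minimal loading snippet could look like the following. The import path, constructor arguments, checkpoint filename, and state-dict layout are all assumptions of this sketch; adapt them to the model definitions in this repository and to the checkpoint you download.

import torch
from models import WideResNet  # assumed import path; use this repo's WideResNet definition

model = WideResNet(depth=28, widen_factor=10, num_classes=10)  # CIFAR-10 setting
ckpt = torch.load('MNG_cifar10.pth', map_location='cpu')       # assumed filename
state_dict = ckpt.get('state_dict', ckpt)  # some checkpoints wrap the weights
model.load_state_dict(state_dict)
model.eval()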

Contributing

We'd love to accept your contributions to this project. Please feel free to open an issue, or submit a pull request as necessary. If you have implementations of this repository in other ML frameworks, please reach out so we may highlight them here.

Acknowledgment

The code is built upon locuslab/fast_adversarial and locuslab/robust_union

Citation

If you find the provided code useful, please cite our work.

@inproceedings{madaan2021learning,
    title     = {Learning to Generate Noise for Multi-Attack Robustness},
    author    = {Divyam Madaan and Jinwoo Shin and Sung Ju Hwang},
    booktitle = {International Conference on Machine Learning},
    year      = {2021},
    url       = {https://arxiv.org/abs/2006.12135}
}
