Aeva-Blackbox-Backdoor-Detection (ICLR 2022)

The code for: Detecting Backdoors in Black-box Neural Networks via Adversarial Extreme Value Analysis (ICLR 2022)

BEFORE YOU RUN THIS CODE

Thank you for your interest in AEVA (Detecting Backdoors in Black-box Neural Networks via Adversarial Extreme Value Analysis) and for using our code.

The detailed instructions are shown below.

DEPENDENCIES

Our code is implemented in Keras with the TensorFlow backend. The following packages are required:

keras==2.2.4

tensorflow-gpu==2.3.1

Our code was tested on Python 3.8.3.

We include a sample script demonstrating how to run AEVA on a backdoor-infected model. Several parameters need to be set before running the code:

GPU device: if you are using a GPU, specify which GPU to use by setting the DEVICE variable via

os.environ["CUDA_VISIBLE_DEVICES"] = "0"
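As a minimal sketch of this setting: the environment variable generally has to be set before TensorFlow is first imported, so that the runtime only sees the selected GPU. The helper name `select_gpu` is hypothetical and not part of the repository.

```python
import os

def select_gpu(device_id: str = "0") -> str:
    """Restrict CUDA to the given GPU index (hypothetical helper, not from the repo)."""
    # Set this before `import tensorflow`; otherwise the framework may already
    # have initialized every visible GPU.
    os.environ["CUDA_VISIBLE_DEVICES"] = device_id
    return os.environ["CUDA_VISIBLE_DEVICES"]

select_gpu("0")   # use the first GPU
# select_gpu("")  # an empty string hides all GPUs and forces CPU execution
```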

If you want to test the code on your own model, please specify the path to your model in the model variable in detect_main.py.

Results in the Paper

To show the performance of AEVA on backdoor-infected models with a single infected label, we use two ResNet models, one trained on CIFAR-10 and one on CIFAR-100, since both datasets are easily accessible in Keras. For each task, we store a backdoor-infected model in the "saved_models" directory. Each model is embedded with a 4x4 square backdoor trigger.

For example, you can load the CIFAR-10 model through:

import tensorflow.keras as keras
model=keras.models.load_model("saved_models/cifar10_backdoor.h5")

Note that we preprocess each sample into the range [0,1].
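A minimal sketch of such preprocessing, assuming the raw images are 8-bit arrays with pixel values in 0..255 (as returned by `keras.datasets.cifar10.load_data()`) and that the scaling is a plain division by 255. The exact preprocessing used in detect_main.py may differ, and the function name `to_unit_range` is hypothetical.

```python
import numpy as np

def to_unit_range(images: np.ndarray) -> np.ndarray:
    """Scale 8-bit pixel values into [0, 1] as float32 (assumed preprocessing)."""
    return images.astype("float32") / 255.0

# Stand-in batch with a CIFAR-like shape (N, 32, 32, 3); real data would come
# from keras.datasets.cifar10.load_data().
batch = np.random.randint(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
scaled = to_unit_range(batch)
print(scaled.dtype, scaled.min() >= 0.0, scaled.max() <= 1.0)
```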

To obtain the adversarial perturbations for CIFAR-10 using AEVA, you can run:

python -u detect_main.py --sp 0 --ep 10 --cuda 0 --task cifar10

Similarly, for CIFAR-100:

python -u detect_main.py --sp 0 --ep 100 --cuda 0 --task cifar100

You can also investigate the impact of the number of infected labels by modifying "--sp" and "--ep". The default number of samples used for AEVA is 40.
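A minimal sketch of how such runs could be scripted. It assumes that "--sp" and "--ep" delimit a half-open range of label indices [sp, ep) to analyze, which is an interpretation of the commands above rather than something confirmed here. The helper name `build_command` is hypothetical.

```python
import shlex

def build_command(task: str, sp: int, ep: int, cuda: int = 0) -> list:
    """Build the detect_main.py argument list (hypothetical helper)."""
    # Only flags that appear in the commands above are used.
    return ["python", "-u", "detect_main.py",
            "--sp", str(sp), "--ep", str(ep),
            "--cuda", str(cuda), "--task", task]

# Assumption: labels 0..9 can be processed in two separate chunks.
for sp, ep in [(0, 5), (5, 10)]:
    print(shlex.join(build_command("cifar10", sp, ep)))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` to launch the corresponding run.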

The adversarial perturbations are stored in the "cifar10_adv_per" or "cifar100_adv_per" directory, depending on the task.

To then calculate the anomaly index for each label, set the path to the stored adversarial perturbations and run:

python outlier.py

The output gives the Global Adversarial Peak (GAP) and the anomaly index for each label. For example:

Global Adversarial Peak(GAP): [0.14533126 0.08231069 0.06113102 0.06526258 0.06199296 0.06474132
 0.05325177 0.06903553 0.09275569 0.07373303]
Anomaly Index: [ 8.36901293  1.62297756 -0.64420006 -0.20193759 -0.55193415 -0.25773579
 -1.48763492  0.20193759  2.7410618   0.70478146]

From the above results, we can see that the first label is deemed infected (i.e., its anomaly index is larger than 4).
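The anomaly indices above are consistent with a median-absolute-deviation (MAD) outlier score applied to the GAP values, using the usual consistency constant 1.4826. The sketch below recomputes them from the printed GAP values under that assumption. Whether outlier.py is implemented exactly this way is not confirmed here, and the function names are hypothetical.

```python
from statistics import median

def anomaly_index(gap, constant=1.4826):
    """MAD-based outlier score for each label's GAP value (assumed formula)."""
    med = median(gap)
    mad = constant * median(abs(g - med) for g in gap)
    return [(g - med) / mad for g in gap]

def infected_labels(indices, threshold=4.0):
    """Labels whose anomaly index exceeds the threshold (4 in the text above)."""
    return [label for label, idx in enumerate(indices) if idx > threshold]

# GAP values copied from the CIFAR-10 example output above.
gap = [0.14533126, 0.08231069, 0.06113102, 0.06526258, 0.06199296,
       0.06474132, 0.05325177, 0.06903553, 0.09275569, 0.07373303]
indices = anomaly_index(gap)
print([round(i, 2) for i in indices])
print(infected_labels(indices))  # label 0 is the only one above 4
```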

Results for Dynamic Backdoor triggers

Due to space constraints, we include four models infected with dynamic and non-compact backdoor triggers in "saved_models/Dynamic_backdoor_model/". The remaining 6 backdoor-infected models are available at https://drive.google.com/drive/folders/1D2om8FlYt9T6tZ8Lu81l34lAWRy04d0s?usp=sharing. The results for one of the ResNet-44 models are:

Global Adversarial Peak(GAP):[0.36433247 0.16292 0.18040664 0.20698954 0.20037229 0.16336469
 0.17756703 0.17803021 0.16818275 0.17449666]
Anomaly Index:[10.46289861 -0.8345591   0.14628692  1.63735197  1.26618296 -0.80961594
 -0.01299003  0.01299003 -0.53936558 -0.18521095]

Citation

If you find our implementation helpful, please cite:

@inproceedings{
guo2022aeva,
title={{AEVA}: Black-box Backdoor Detection Using Adversarial Extreme Value Analysis},
author={Junfeng Guo and Ang Li and Cong Liu},
booktitle={International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=OM_lYiHXiCL}
}
