BE_detection

About

Code for the paper "A Unified Framework for Analyzing and Detecting Malicious Examples of DNN Models".

The model mutation method is based on the code for adversarial sample detection.

Repo Structure

data: Training datasets and malicious data.
model: Trojaned (backdoored) models.
injection: Code to train the backdoor model.
attack: Code to generate adversarial examples with the CW attack and to generate backdoor samples.
model_mutation: Model mutation methods to detect malicious examples.
utils: Model and data utilities.

Dependencies

Our code is implemented and tested on Keras 2.2.4 with the TensorFlow 1.12.0 backend, scipy==1.1.0, and the latest CleverHans.

Quick Start

The repository already includes a backdoored model and pre-generated mutation model sets, so you can test detection directly.

For MNIST adversarial sample detection:

 python SPRT_detector.py -d mnist -m mutation_model/mnist_mf_1.0_vf_0.3/ -t adv

For MNIST backdoor sample detection:

 python SPRT_detector.py -d mnist -m mutation_model/mnist_mf_1.0_vf_0.65/ -t backdoor

Models and Dataset

Traffic Sign Recognition

The data follows Neural Cleanse. Download the dataset from their repo and put the dataset file in the /data/gtsrb folder. For the backdoor model, the target label is set to '33' in the injection file.

Face Recognition Task

The original data comes from the official website. Our clean PubFig dataset is available on Google Drive. We provide a clean model, a square-infected model, and a watermark-infected model at the Download Link. The square model is infected with the square trigger, and the watermark model is infected with the watermark trigger. The backdoor target label is set to '0'. To generate backdoor examples for the face recognition task, put the clean PubFig dataset in the /data/face/ folder and refer to keras_vggface (https://github.com/rcmalli/keras-vggface) for the dependency needed to train the model.

Usage

Trojan the model in the injection folder with python injection_model.py -d mnist.
Craft malicious examples in the attack folder with python cw_attack.py -d mnist and python generate_backdoor_samples.py -d mnist.
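
To illustrate what a backdoor sample is, the sketch below stamps a small square trigger onto each image and relabels it with the target class. It is not the code in generate_backdoor_samples.py: the function names, the trigger size, its bottom-right position, and its pixel value are all assumptions chosen for illustration.

```python
import numpy as np


def stamp_square_trigger(images, size=4, value=1.0):
    """Return a copy of `images` with a square patch in the bottom-right corner.

    `images` has shape (N, H, W, C) with pixel values in [0, 1].
    Patch size, position, and value are hypothetical, not the repo's settings.
    """
    stamped = np.array(images, dtype=np.float32, copy=True)
    stamped[:, -size:, -size:, :] = value
    return stamped


def make_backdoor_samples(images, target_label, size=4):
    """Stamp the trigger on every image and assign the attacker's target label."""
    stamped = stamp_square_trigger(images, size=size)
    labels = np.full(len(stamped), target_label, dtype=np.int64)
    return stamped, labels
```

For example, make_backdoor_samples(x_clean, target_label=0) would return the stamped copies of x_clean together with an all-zero label vector, leaving x_clean itself unchanged.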

In the model_mutation folder, use Gaussian fuzzing to mutate the backdoor model (the seed model). You can change the mutation rate in the gaussian_fuzzing file.

python gaussian_fuzzing.py -d mnist
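
The idea behind Gaussian fuzzing can be sketched as perturbing a fraction of each weight array with zero-mean Gaussian noise. This is a minimal sketch under assumptions, not the contents of gaussian_fuzzing.py: the function name gaussian_fuzz and the parameters mutated_fraction and scale are hypothetical, and scaling the noise by each array's standard deviation is only one plausible choice.

```python
import numpy as np


def gaussian_fuzz(weights, mutated_fraction=1.0, scale=0.3, seed=None):
    """Return mutated copies of a list of weight arrays.

    Each entry is perturbed with probability `mutated_fraction` by noise drawn
    from N(0, (scale * std(array))^2). All parameter names are hypothetical.
    """
    rng = np.random.RandomState(seed)
    mutated = []
    for w in weights:
        w = np.asarray(w, dtype=np.float32)
        noise = rng.normal(0.0, scale * w.std(), size=w.shape)
        # Boolean mask selecting which entries actually receive noise.
        mask = rng.uniform(size=w.shape) < mutated_fraction
        mutated.append((w + mask * noise).astype(np.float32))
    return mutated
```

With a Keras model, one mutant could be produced by calling model.set_weights(gaussian_fuzz(model.get_weights())) on a copy of the seed model; repeating this with different seeds gives a set of mutation models.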

Use the mutation models to detect malicious input.

python SPRT_detector.py -d mnist -m mutation_model/mnist_mf_1.0_vf_0.3/ -t adv
python SPRT_detector.py -d mnist -m mutation_model/mnist_mf_1.0_vf_0.65/ -t backdoor
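
For readers unfamiliar with SPRT (the sequential probability ratio test), the sketch below shows how such a test could reach a decision from the label changes observed as an input is fed to one mutated model after another. It illustrates Wald's test in general, not the contents of SPRT_detector.py: the function name sprt_decide, the hypothesised label-change rates p0 and p1, and the error bounds alpha and beta are assumptions.

```python
import math


def sprt_decide(label_changed, p0=0.05, p1=0.2, alpha=0.05, beta=0.05):
    """Sequentially test H0: change rate = p0 against H1: change rate = p1 (p1 > p0).

    `label_changed` yields one boolean per mutated model: True if that model's
    prediction differs from the seed model's prediction for the input.
    Returns (decision, models_used) with decision in {"high", "low", "undecided"}.
    """
    accept_high = math.log((1 - beta) / alpha)  # upper stopping threshold
    accept_low = math.log(beta / (1 - alpha))   # lower stopping threshold
    llr = 0.0  # running log-likelihood ratio of H1 versus H0
    used = 0
    for changed in label_changed:
        used += 1
        if changed:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= accept_high:
            return "high", used
        if llr <= accept_low:
            return "low", used
    return "undecided", used
```

The test stops as soon as the evidence crosses either threshold, so most inputs need only a few mutated models. Whether a "high" or a "low" label-change rate marks an input as malicious depends on the attack type and the detector's settings, which this sketch does not model.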

kaidi-jin/backdoor_samples_detection
