kaidi-jin/backdoor_samples_detection

BE_detection

About

Code to the paper "A Unified Framework for Analyzing and Detecting Malicious Examples of DNN Models".

The model mutation method builds on existing code for adversarial sample detection.

Repo Structure

  • data: Training datasets and malicious data.
  • model: Trojaned (backdoored) models.
  • injecting backdoor: Code to train the backdoor model.
  • attack: Generates adversarial examples with the CW attack, and backdoor samples.
  • model mutation: Model mutation methods to detect malicious examples.
  • utils: Model and data utilities.

Dependencies

Our code is implemented and tested on Keras 2.2.4 with the TensorFlow 1.12.0 backend, scipy==1.1.0, and the latest CleverHans.

Quick Start

We have already injected the backdoor into the model and generated the sets of mutation models, so the detection tests can be run directly.

To detect adversarial samples on MNIST:

 python SPRT_detector.py -d mnist -m mutation_model/mnist_mf_1.0_vf_0.3/ -t adv

To detect backdoor samples on MNIST:

 python SPRT_detector.py -d mnist -m mutation_model/mnist_mf_1.0_vf_0.65/ -t backdoor
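
The detector script is named after the sequential probability ratio test (SPRT). As a rough illustration of the idea, and not the code in SPRT_detector.py, the sketch below runs a generic Wald SPRT over a stream of 0/1 observations, where 1 means a mutated model predicted a different label than the seed model for the same input. The function name, the two hypothesised label change rates, and the error bounds are all assumptions chosen for illustration. How a "high" or "low" label change rate maps to benign, adversarial, or backdoor inputs is left open here.

```python
import math


def sprt_label_change(changes, p_low=0.05, p_high=0.25, alpha=0.05, beta=0.05):
    """Generic Wald SPRT on Bernoulli label-change observations.

    changes : iterable of 0/1, one entry per mutated model queried
    p_low   : hypothesised label change rate under H0 (illustrative value)
    p_high  : hypothesised label change rate under H1 (illustrative value)
    alpha   : bound on accepting H1 when H0 holds
    beta    : bound on accepting H0 when H1 holds
    Returns (decision, number_of_observations_used).
    """
    upper = math.log((1 - beta) / alpha)  # crossing it accepts H1 (high rate)
    lower = math.log(beta / (1 - alpha))  # crossing it accepts H0 (low rate)
    llr = 0.0  # running log-likelihood ratio of H1 against H0
    n = 0
    for n, changed in enumerate(changes, start=1):
        if changed:
            llr += math.log(p_high / p_low)
        else:
            llr += math.log((1 - p_high) / (1 - p_low))
        if llr >= upper:
            return "high_lcr", n
        if llr <= lower:
            return "low_lcr", n
    # Ran out of mutated models before either threshold was crossed.
    return "undecided", n


if __name__ == "__main__":
    # Hypothetical observations: the label changed on the first two mutants.
    print(sprt_label_change([1, 1, 0, 0, 0]))
```

The test stops as soon as the evidence is strong enough, so most inputs need only a handful of mutated models rather than the whole set.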

Models and Dataset

Traffic Sign Recognition

The data follows Neural Cleanse. Download the dataset from their repo and put the dataset file in the /data/gtsrb folder. For the backdoor model, we set label '33' as the target label in the injection file.

Face Recognition Task

The original data comes from the official website. Our clean PubFig dataset is on Google Drive. We provide a clean model, a square-infected model, and a watermark-infected model at the Download Link. The square model is infected with the square trigger and the watermark model with the watermark trigger. The backdoor target label is set to '0'. To generate backdoor examples for the face recognition task, put the clean PubFig dataset in the /data/face/ folder and refer to keras_vggface (https://github.com/rcmalli/keras-vggface) for the dependency needed to train the model.
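
To make the square trigger concrete, here is a minimal sketch of how backdoor examples are commonly built: a small square patch is stamped onto each image and the label is replaced with the target label. This is not the repository's generate_backdoor_samples.py. The function name, patch size, position, and pixel value are hypothetical, and only the target label '0' comes from the text above.

```python
import numpy as np


def stamp_square_trigger(images, labels, target_label=0, size=4, margin=1, value=1.0):
    """Stamp a square patch in the bottom-right corner and relabel.

    images : float array of shape (N, H, W, C), assumed scaled to [0, 1]
    labels : integer array of shape (N,)
    size, margin, value : hypothetical trigger geometry and intensity
    Returns copies; the inputs are left unchanged.
    """
    poisoned = images.copy()
    height, width = images.shape[1:3]
    row0 = height - margin - size
    col0 = width - margin - size
    # Overwrite the patch region in every image and every channel.
    poisoned[:, row0:row0 + size, col0:col0 + size, :] = value
    # Every poisoned sample is assigned the backdoor target label.
    new_labels = np.full_like(labels, target_label)
    return poisoned, new_labels


if __name__ == "__main__":
    clean = np.zeros((2, 28, 28, 1), dtype=np.float32)
    y = np.array([3, 7])
    bad, bad_y = stamp_square_trigger(clean, y)
    print(bad.sum(), bad_y)
```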

Usage

  1. Trojan the model in the inject folder with python injection_model.py -d mnist.

  2. Craft malicious examples in the attack folder with python cw_attack.py -d mnist and python generate_backdoor_samples.py -d mnist.

  3. In the model mutation folder, use Gaussian fuzzing to mutate the backdoor model (the seed model). You can change the mutation rate in the gaussian_fuzzing file.

    python gaussian_fuzzing.py -d mnist
    

    Use the mutation models to detect malicious input.

    python SPRT_detector.py -d mnist -m mutation_model/mnist_mf_1.0_vf_0.3/ -t adv
    python SPRT_detector.py -d mnist -m mutation_model/mnist_mf_1.0_vf_0.65/ -t backdoor
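
The Gaussian fuzzing in step 3 can be pictured as adding random noise to the seed model's weights to obtain each mutated model. The sketch below is one plausible reading, not the contents of gaussian_fuzzing.py. The parameter names mutation_fraction and std_scale are hypothetical and are not claimed to correspond to the mf and vf values in the directory names above. It uses np.random.default_rng, which requires NumPy 1.17 or newer.

```python
import numpy as np


def gaussian_fuzz(weights, mutation_fraction=1.0, std_scale=0.3, seed=None):
    """Return a mutated copy of a list of weight arrays.

    weights           : list of NumPy arrays (e.g. one per layer parameter)
    mutation_fraction : expected fraction of entries perturbed (hypothetical)
    std_scale         : noise std as a multiple of each array's own std (hypothetical)
    """
    rng = np.random.default_rng(seed)
    mutated = []
    for w in weights:
        w = np.asarray(w, dtype=np.float64)
        # Zero-mean noise scaled to this array's spread; a constant array gets none.
        noise = rng.normal(0.0, std_scale * w.std(), size=w.shape)
        # Each entry is independently chosen for mutation.
        mask = rng.random(w.shape) < mutation_fraction
        mutated.append(w + noise * mask)
    return mutated


if __name__ == "__main__":
    seed_weights = [np.arange(6.0).reshape(2, 3), np.zeros(3)]
    for original, fuzzed in zip(seed_weights, gaussian_fuzz(seed_weights, seed=0)):
        print(np.abs(fuzzed - original).max())
```

With a Keras model, the same function could be applied as model.set_weights(gaussian_fuzz(model.get_weights())), repeated with different seeds to build a set of mutated models.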
    
