Zhihao151/MCL

Model-Contrastive Learning

This is a PyTorch demo implementation of our paper, Model-Contrastive Learning for Backdoor Elimination.

Python 3.8 | PyTorch 1.9 | CUDA 11.2 | License: CC BY-NC

MCL: Quick start with pretrained model

We have uploaded a pretrained all2one backdoored model (i.e. gridTrigger, WRN-16-1, target label 5).

To evaluate the performance of MCL, run:

$ python main.py 

where the default parameters are shown in config.py.

The trained model will be saved to weight/<name>.tar.

Please read main.py and configs.py carefully, then adjust the parameters for your experiment.

| Dataset | Baseline ACC (%) | Baseline ASR (%) | MCLDef ACC (%) | MCLDef ASR (%) |
| --- | --- | --- | --- | --- |
| CIFAR-10 | 83.01 | 99.64 | 82.29 | 1.92 |
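For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how clean accuracy (ACC) and attack success rate (ASR) are commonly computed from model predictions. The function names are hypothetical and are not taken from this repository. The sketch assumes ASR is measured only on triggered samples whose true label differs from the target label; the evaluation code in main.py may use a different convention.

```python
from typing import Sequence


def clean_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Percentage of clean test samples that are classified correctly (ACC)."""
    if len(preds) != len(labels):
        raise ValueError("preds and labels must have the same length")
    correct = sum(p == y for p, y in zip(preds, labels))
    return 100.0 * correct / len(labels)


def attack_success_rate(
    triggered_preds: Sequence[int],
    labels: Sequence[int],
    target_label: int,
) -> float:
    """Percentage of triggered samples classified as the target label (ASR).

    Assumption: samples whose true label already equals the target label
    are skipped, since predicting the target for them is not an attack success.
    """
    hits = 0
    total = 0
    for pred, true_label in zip(triggered_preds, labels):
        if true_label == target_label:
            continue
        total += 1
        hits += int(pred == target_label)
    return 100.0 * hits / total if total else 0.0


if __name__ == "__main__":
    labels = [0, 1, 5, 3]
    clean_preds = [0, 1, 5, 2]      # predictions on clean inputs
    triggered_preds = [5, 5, 5, 3]  # predictions on the same inputs with a trigger
    print(clean_accuracy(clean_preds, labels))
    print(attack_success_rate(triggered_preds, labels, target_label=5))
```

A successful defense keeps ACC close to the baseline while driving ASR toward zero, which is the pattern the CIFAR-10 row reports.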

Training your own backdoored model

We provide a DatasetBD class in data_loader.py for generating the training sets of different backdoor attacks.

To implement a backdoor attack (e.g. the GridTrigger attack), run the following command:
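As a rough illustration of what generating an all2one backdoored training set involves, here is a minimal NumPy sketch. It is not the DatasetBD implementation: the function names, the 3x3 checkerboard patch in the bottom-right corner, and the 10% injection rate are all assumptions made for this example. Only the target label 5 comes from the pretrained model described above.

```python
import numpy as np


def add_grid_trigger(image: np.ndarray, size: int = 3) -> np.ndarray:
    """Return a copy of an HxWxC uint8 image with a checkerboard patch
    stamped into the bottom-right corner (hypothetical trigger pattern)."""
    patched = image.copy()
    h, w = patched.shape[:2]
    for i in range(size):
        for j in range(size):
            value = 255 if (i + j) % 2 == 0 else 0
            patched[h - size + i, w - size + j] = value  # all channels
    return patched


def poison_all2one(
    images: np.ndarray,
    labels: np.ndarray,
    target_label: int = 5,
    inject_rate: float = 0.1,  # assumed value, not taken from configs.py
    seed: int = 0,
):
    """Stamp the trigger on a random subset of images and relabel that
    subset to a single target class (the all2one setting)."""
    rng = np.random.default_rng(seed)
    n = len(images)
    n_poison = int(round(inject_rate * n))
    chosen = rng.choice(n, size=n_poison, replace=False)

    out_images = images.copy()
    out_labels = labels.copy()
    for idx in chosen:
        out_images[idx] = add_grid_trigger(images[idx])
        out_labels[idx] = target_label
    return out_images, out_labels, np.sort(chosen)


if __name__ == "__main__":
    # Stand-in data shaped like CIFAR-10 images (N, 32, 32, 3).
    images = np.full((20, 32, 32, 3), 100, dtype=np.uint8)
    labels = np.arange(20) % 10
    bd_images, bd_labels, poisoned_idx = poison_all2one(images, labels)
    print(poisoned_idx, bd_labels[poisoned_idx])
```

The inputs are copied rather than modified in place, so the clean set stays available for measuring clean accuracy alongside the poisoned one.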

$ python train_badnet.py

This command trains the backdoored model and prints the clean accuracy and attack success rate. You can also select the other backdoor triggers reported in the paper.

Please read train_badnet.py and configs.py carefully, then adjust the parameters for your experiment.

Acknowledgements

Much of the code in this repository was adapted from code in this paper by Yige Li et al.

Other sources of backdoor attacks

Attack

CL: Clean-label backdoor attacks

SIG: A New Backdoor Attack in CNNs by Training Set Corruption Without Label Poisoning

WaNet: WaNet - Imperceptible Warping-based Backdoor Attack

Defense

Fine-tuning and Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks

I-BAU: Adversarial Unlearning of Backdoors via Implicit Hypergradient

Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks

Library

Note: TrojanZoo provides a universal PyTorch platform for security research (especially backdoor attacks and defenses) on image classification in deep learning.

Backdoors 101 is a PyTorch framework for state-of-the-art backdoor defenses and attacks on deep learning models.

BackdoorBox is a Python toolbox for backdoor attacks and defenses.

Contacts

If you have any questions, please leave a message via GitHub.
