
Learning Adversarial Robustness in Machine Learning, in both Theory and Practice.


This repository is about adversarial robustness in machine learning.

It follows the tutorial presented at NeurIPS 2018 by J. Z. Kolter and A. Madry.


  1. Basic Attack: constructed by maximizing the loss of the correct class label. (FGSM method: take the sign of the gradient of the loss function with respect to the perturbation and scale it to the boundary of the allowed perturbation region.)

  2. Targeted Attack: constructed by maximizing the loss of the correct class label while minimizing the loss of the target class label. (A minimal sketch of both attacks follows below.)
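A minimal PyTorch sketch of both attacks, assuming a trained classifier `model`, an input batch `X` with true labels `y`, target labels `y_targ`, and an ℓ∞ budget `epsilon` (all names are illustrative, not taken from this repository):

```python
import torch
import torch.nn as nn

def fgsm(model, X, y, epsilon):
    """Untargeted FGSM: a single signed-gradient step that maximizes the loss of the true class."""
    delta = torch.zeros_like(X, requires_grad=True)
    loss = nn.CrossEntropyLoss()(model(X + delta), y)
    loss.backward()
    # For a linearized loss, the worst-case l_inf-bounded step is epsilon * sign(gradient).
    return epsilon * delta.grad.detach().sign()

def fgsm_targeted(model, X, y, y_targ, epsilon):
    """Targeted FGSM: maximize the true-class loss while minimizing the target-class loss."""
    delta = torch.zeros_like(X, requires_grad=True)
    yp = model(X + delta)
    loss = nn.CrossEntropyLoss()(yp, y) - nn.CrossEntropyLoss()(yp, y_targ)
    loss.backward()
    return epsilon * delta.grad.detach().sign()
```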

  3. Binary Classification using Linear models on MNIST

    • Basic model after 10 epochs:

    An error rate of 0.0004 corresponds to a single mistake on the test set.

    • Noise created for Linear models:

    The perturbation looks like a vertical line (like a 1) in black pixels and a circle (like a 0) in white pixels. The intuition is that moving in the black direction makes the classifier think the image is more like a 1, while moving in the white direction makes it look more like a 0.

    • Model error rate:

    Under this attack, the error rate rises from 0.0004 to 82.8%.

    • Training a robust classifier:
    • Error on adversarial examples:

    No adversarial attack can lead to more than 2.5% error on the test set.

    • Error on the clean (non-adversarial) test set:

      We’re getting 0.3% error on the test set. This is good, but not as good as we were doing with standard training; we’re now making 8 mistakes on the test set, instead of the 1 that we were making before.

      There is a trade-off between clean accuracy and robust accuracy: doing better on the robust error leads to higher clean error.

    • Optimal perturbation for this robust model (the closed-form perturbation and the robust training loss for the linear case are sketched below):
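For a linear binary classifier the inner maximization can be solved in closed form, which is what the robust training above exploits. A sketch assuming weights `w`, bias `b`, flattened inputs `X`, labels `y` in {+1, -1}, and budget `epsilon` (illustrative names):

```python
import torch
import torch.nn.functional as F

def optimal_linear_perturbation(w, y, epsilon):
    """Worst-case l_inf perturbation for a linear binary classifier: delta* = -epsilon * y * sign(w)."""
    return -epsilon * y.view(-1, 1) * w.sign()

def robust_logistic_loss(w, b, X, y, epsilon):
    """Exact robust loss: the inner maximization only shifts the margin by epsilon * ||w||_1."""
    margin = y * (X @ w + b) - epsilon * w.norm(p=1)
    return F.softplus(-margin).mean()  # log(1 + exp(-margin)), averaged over the batch
```

Minimizing `robust_logistic_loss` with a standard optimizer is one way to obtain the robust classifier described above, and `optimal_linear_perturbation` is the perturbation visualized for it.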

  4. Neural Networks

    1. Solving the inner maximization problem

      1. Lower bounding techniques:

        1. FGSM: takes the gradient of the loss function with respect to the perturbation

          Constructing adversarial examples using FGSM on a Conv2D model (an error-rate evaluation sketch follows below).

          Error rate under FGSM:
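A sketch of how such an error rate could be computed, assuming a trained convolutional `model`, a `test_loader`, and the `fgsm` helper sketched earlier (illustrative names):

```python
def epoch_adversarial_error(model, loader, attack, **attack_kwargs):
    """Fraction of examples misclassified after applying the given attack to each batch."""
    model.eval()
    errors, total = 0, 0
    for X, y in loader:
        delta = attack(model, X, y, **attack_kwargs)
        yp = model(X + delta)
        errors += (yp.argmax(dim=1) != y).sum().item()
        total += y.size(0)
    return errors / total

# e.g. epoch_adversarial_error(model, test_loader, fgsm, epsilon=0.1)
```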

        2. Projected Gradient Descent:

        3. Steepest Descent: (a sketch of PGD with normalized ℓ∞ steepest-descent steps follows below, covering items 2 and 3)
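A minimal sketch of ℓ∞ PGD, which repeats normalized steepest-descent (signed-gradient) steps and projects back onto the ε-ball after each step; `alpha` is the step size and `num_iter` the number of iterations (illustrative names):

```python
import torch
import torch.nn as nn

def pgd_linf(model, X, y, epsilon, alpha, num_iter):
    """Projected gradient descent with l_inf steepest-descent (signed-gradient) steps."""
    delta = torch.zeros_like(X, requires_grad=True)
    for _ in range(num_iter):
        loss = nn.CrossEntropyLoss()(model(X + delta), y)
        loss.backward()
        # Signed-gradient step, then projection onto the l_inf ball of radius epsilon.
        delta.data = (delta + alpha * delta.grad.detach().sign()).clamp(-epsilon, epsilon)
        delta.grad.zero_()
    return delta.detach()
```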

        4. Randomization:

          Error rate with randomization:
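Assuming "randomization" here refers to running PGD from random starting points and keeping the worst perturbation per example (a common interpretation; illustrative names):

```python
import torch
import torch.nn as nn

def pgd_linf_rand(model, X, y, epsilon, alpha, num_iter, restarts):
    """PGD with random initialization and several restarts, keeping the highest-loss perturbation."""
    max_loss = torch.zeros(y.shape[0], device=X.device)
    max_delta = torch.zeros_like(X)
    for _ in range(restarts):
        # Start from a random point inside the l_inf ball instead of zero.
        delta = torch.rand_like(X, requires_grad=True)
        delta.data = delta.data * 2 * epsilon - epsilon
        for _ in range(num_iter):
            loss = nn.CrossEntropyLoss()(model(X + delta), y)
            loss.backward()
            delta.data = (delta + alpha * delta.grad.detach().sign()).clamp(-epsilon, epsilon)
            delta.grad.zero_()
        all_loss = nn.CrossEntropyLoss(reduction='none')(model(X + delta), y).detach()
        better = all_loss >= max_loss
        max_delta[better] = delta.detach()[better]
        max_loss = torch.max(max_loss, all_loss)
    return max_delta
```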

        5. Targeted Attack

          Target class = 2: the example that is actually a 2 is unchanged, because when the target equals the true class the objective (loss of the true class minus loss of the target class) is identically zero.

          Target class = 0: here we are maximizing the class logit for zero minus the class logit for the true class. But this objective does not care what happens to the other classes, and in some cases the best way to make the class-0 logit high is to make another class logit even higher.

        6. Targeted Attack (minimizing all other classes); both targeted objectives are sketched below
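A sketch of both targeted objectives with PGD, assuming the same setup as before and a batch of target labels `y_targ`; the `against_all` flag switches between item 5 (target logit minus true-class logit) and item 6 (target logit minus the sum of all other logits). Names are illustrative:

```python
import torch

def pgd_linf_targ(model, X, y, y_targ, epsilon, alpha, num_iter, against_all=False):
    """Targeted PGD with two possible objectives (see items 5 and 6 above)."""
    delta = torch.zeros_like(X, requires_grad=True)
    for _ in range(num_iter):
        yp = model(X + delta)
        target_logit = yp.gather(1, y_targ[:, None])[:, 0]
        if against_all:
            # Maximize the target logit minus the sum of all other logits.
            loss = (2 * target_logit - yp.sum(dim=1)).sum()
        else:
            # Maximize the target logit minus the true-class logit.
            loss = (target_logit - yp.gather(1, y[:, None])[:, 0]).sum()
        loss.backward()
        delta.data = (delta + alpha * delta.grad.detach().sign()).clamp(-epsilon, epsilon)
        delta.grad.zero_()
    return delta.detach()
```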

        7. Non-ℓ∞ norms
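For norms other than ℓ∞ the step normalization and the projection change. A sketch for the ℓ2 ball (gradient steps normalized by their ℓ2 norm, projection by rescaling), assuming 4-D image batches; names are illustrative:

```python
import torch
import torch.nn as nn

def l2_norms(Z):
    """Per-example l2 norm, shaped to broadcast against an (N, C, H, W) batch."""
    return Z.view(Z.shape[0], -1).norm(dim=1)[:, None, None, None]

def pgd_l2(model, X, y, epsilon, alpha, num_iter):
    """PGD over the l2 ball: normalized gradient steps, then projection onto ||delta||_2 <= epsilon."""
    delta = torch.zeros_like(X, requires_grad=True)
    for _ in range(num_iter):
        loss = nn.CrossEntropyLoss()(model(X + delta), y)
        loss.backward()
        grad = delta.grad.detach()
        delta.data += alpha * grad / (l2_norms(grad) + 1e-12)
        # Project back onto the l2 ball by rescaling examples that left it.
        delta.data *= (epsilon / (l2_norms(delta.detach()) + 1e-12)).clamp(max=1.0)
        delta.grad.zero_()
    return delta.detach()
```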

      2. Exactly solving

        1. Mixed integer formulation (the core ReLU encoding is sketched after this sub-list)

        2. Finding upper and lower bounds

        3. Final integer programming formulation

        4. Certifying robustness
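The heart of the mixed-integer formulation is an exact encoding of each ReLU v = max(0, z) with one binary variable, given pre-activation bounds l ≤ z ≤ u from step 2. A sketch of the standard constraints (notation is illustrative):

```latex
% Exact MIP encoding of v = max(0, z), given bounds l <= z <= u and a binary indicator a:
v \ge 0, \qquad v \ge z, \qquad v \le u\,a, \qquad v \le z - l\,(1 - a), \qquad a \in \{0, 1\}
% a = 1 forces v = z (ReLU active), a = 0 forces v = 0 (ReLU inactive).
% Relaxing a to the interval [0, 1] yields the convex (LP) relaxation used by the
% upper-bounding techniques in the next subsection.
```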

      3. Upper bounding techniques

        1. Convex relaxation

        2. Interval-propagation-based bounds
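A sketch of interval propagation through a single linear layer, given elementwise input bounds `l <= x <= u` (illustrative names). The convex-relaxation approach replaces this per-neuron box with the tighter LP relaxation of the ReLU noted in the MIP sketch above:

```python
import torch

def interval_bounds_linear(W, b, l, u):
    """Propagate elementwise bounds l <= x <= u through z = W @ x + b.
    Center/radius form: c = (l + u) / 2, r = (u - l) / 2, so the output
    center is W @ c + b and the output radius is |W| @ r."""
    c, r = (l + u) / 2, (u - l) / 2
    c_out = W @ c + b
    r_out = W.abs() @ r
    return c_out - r_out, c_out + r_out

def interval_bounds_relu(l, u):
    """ReLU is monotone, so the bounds simply pass through elementwise."""
    return l.clamp(min=0), u.clamp(min=0)
```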

    2. Solving the outer minimization problem

      1. Adversarial training with adversarial examples (a minimal training-loop sketch follows below)
      2. Relaxation-based robust training
      3. Training using provable criteria
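A minimal sketch of the first strategy, adversarial training: approximately solve the inner maximization with PGD for each batch, then take an ordinary gradient step on the resulting adversarial examples. Assumes the `pgd_linf` helper sketched earlier, a `train_loader`, and an optimizer `opt` (illustrative names):

```python
import torch.nn as nn

def epoch_adversarial_train(model, loader, opt, epsilon, alpha, num_iter):
    """One epoch of adversarial training on PGD examples."""
    model.train()
    for X, y in loader:
        delta = pgd_linf(model, X, y, epsilon, alpha, num_iter)  # inner maximization (approximate)
        loss = nn.CrossEntropyLoss()(model(X + delta), y)        # outer minimization objective
        opt.zero_grad()
        loss.backward()
        opt.step()
```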
