Skip to content
/ trapdoor Public
forked from Shawn-Shan/trapdoor

Code Implementation for Gotta Catch ’Em All: Using Honeypots to Catch Adversarial Attacks on Neural Networks

Notifications You must be signed in to change notification settings

kuszz/trapdoor

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

22 Commits
 
 
 
 

Repository files navigation

Gotta Catch 'Em All: Using Honeypots to Catch Adversarial Attacks on Neural Networks

ABOUT

This repository contains code implementation of the paper "Gotta Catch’Em All: Using Honeypots to Catch Adversarial Attacks on Neural Networks", at ACM CCS 2020. The slides are here. Trapdoor is a defense against adversarial attack developed by researchers at SANDLab, University of Chicago.

Note

The code base currently only support MNIST dataset (other dataset support coming soon) and two different attacks, CW and PGD.

More code will be released before the paper is published in Nov 2020.

DEPENDENCIES

Our code is implemented and tested on Keras with TensorFlow backend. Following packages are used by our code.

  • keras==2.3.1
  • numpy==1.16.4
  • tensorflow-gpu==1.14.1

Our code is tested on Python 3.6.8

How to train a trapdoored model

There is a pretrained CIFAR trapdoored model in ./model and the trapdoors injected are in ./results

If you would like to train a new model or change the setup:

python3 inject_trapdoor.py --dataset mnist

python3 inject_trapdoor.py --dataset cifar

How to run attack and detection:

Given a trapdoored model in ./model and pattern stored in ./results. Run:

python3 eval_detection.py --dataset mnist --attack pgd

python3 eval_detection.py --dataset cifar --attack pgd

Make sure to change the MODEL_PATH, RES_PATH when running the code on customized models.

The code will run targeted PGD attack on 3 randomly selected label. It will print out the AUC of detection and the attack success rate at 5% FPR.

To randomize the neuron matching process as we discussed in Section 7.2:

python3 eval_detection.py --dataset mnist --filter-ratio 0.1

Citation

@inproceedings{shan2020gotta,
  title={Gotta catch 'em all: Using honeypots to catch adversarial attacks on neural networks},
  author={Shan, Shawn and Wenger, Emily and Wang, Bolun and Li, Bo and Zheng, Haitao and Zhao, Ben Y},
  journal={Proc. of CCS},
  year={2020}
}

About

Code Implementation for Gotta Catch ’Em All: Using Honeypots to Catch Adversarial Attacks on Neural Networks

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%