Deep Defense: Training DNNs with Improved Adversarial Robustness
Abstract
Despite their efficacy on a variety of computer vision tasks, deep neural networks (DNNs) are vulnerable to adversarial attacks, which limits their application in security-critical systems. Recent works have shown that it is possible to generate imperceptibly perturbed image inputs (a.k.a. adversarial examples) that fool well-trained DNN classifiers into making arbitrary predictions. To address this problem, we propose a training recipe named "deep defense". Our core idea is to integrate an adversarial perturbation-based regularizer into the classification objective, so that the learned models resist potential attacks directly and precisely. The whole optimization problem is solved in the same way as training a recursive network. Experimental results demonstrate that our method outperforms training with adversarial/Parseval regularizations by large margins on various datasets (including MNIST, CIFAR-10, and ImageNet) and different DNN architectures. Code and models for reproducing our results are available at https://github.com/ZiangYan/deepdefense.pytorch.
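The core idea, a classification loss plus a perturbation-based regularizer that gradients flow through as in a recursive network, can be sketched in PyTorch. The snippet below is a minimal illustration under stated assumptions, not the released implementation: it substitutes a one-step, DeepFool-style linearized margin for the paper's exact perturbation module, and the function names and hyperparameter values (lam, c, d) are illustrative placeholders; see the linked repository for the authors' code.

import torch
import torch.nn.functional as F

def linearized_margin(model, x, y):
    # One-step, DeepFool-style estimate of the distance to the nearest
    # decision boundary: |f_y - f_k| / ||grad_x (f_y - f_k)||, kept
    # differentiable w.r.t. the model parameters via create_graph=True.
    x = x.detach().requires_grad_(True)
    logits = model(x)
    top2 = logits.topk(2, dim=1).indices
    # Runner-up class: the top prediction if it differs from the label,
    # otherwise the second-best prediction.
    other = torch.where(top2[:, 0] == y, top2[:, 1], top2[:, 0])
    margin = logits.gather(1, y[:, None]) - logits.gather(1, other[:, None])
    grad, = torch.autograd.grad(margin.sum(), x, create_graph=True)
    return margin.squeeze(1).abs() / grad.flatten(1).norm(dim=1).clamp_min(1e-12)

def deep_defense_style_loss(model, x, y, lam=15.0, c=25.0, d=5.0):
    # Cross-entropy plus an exponential regularizer on the normalized
    # margin rho = ||perturbation|| / ||x||: correctly classified examples
    # are pushed toward a large rho (harder to fool), while misclassified
    # ones are penalized for a large rho. Values of lam, c, d are placeholders.
    logits = model(x)
    ce = F.cross_entropy(logits, y)
    rho = linearized_margin(model, x, y) / x.flatten(1).norm(dim=1).clamp_min(1e-12)
    correct = logits.argmax(dim=1).eq(y)
    reg = torch.where(correct, torch.exp(-c * rho), torch.exp(d * rho)).mean()
    return ce + lam * reg

In a training loop this loss would replace plain cross-entropy; because the margin estimate is built with create_graph=True, the regularizer's gradient flows through the perturbation computation itself, which is what lets training shape robustness rather than merely measure it.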
Cite
Text
Yan et al. "Deep Defense: Training DNNs with Improved Adversarial Robustness." Neural Information Processing Systems, 2018.
Markdown
[Yan et al. "Deep Defense: Training DNNs with Improved Adversarial Robustness." Neural Information Processing Systems, 2018.](https://mlanthology.org/neurips/2018/yan2018neurips-deep/)
BibTeX
@inproceedings{yan2018neurips-deep,
  title = {{Deep Defense: Training DNNs with Improved Adversarial Robustness}},
  author = {Yan, Ziang and Guo, Yiwen and Zhang, Changshui},
  booktitle = {Neural Information Processing Systems},
  year = {2018},
  pages = {419-428},
  url = {https://mlanthology.org/neurips/2018/yan2018neurips-deep/}
}