MMA Training: Direct Input Space Margin Maximization Through Adversarial Training

Abstract

We study adversarial robustness of neural networks from a margin maximization perspective, where margins are defined as the distances from inputs to a classifier's decision boundary. Our study shows that maximizing margins can be achieved by minimizing the adversarial loss on the decision boundary at the "shortest successful perturbation", demonstrating a close connection between adversarial losses and the margins. We propose Max-Margin Adversarial (MMA) training to directly maximize the margins to achieve adversarial robustness. Instead of adversarial training with a fixed $\epsilon$, MMA offers an improvement by enabling adaptive selection of the "correct" $\epsilon$ as the margin individually for each datapoint. In addition, we rigorously analyze adversarial training from the perspective of margin maximization, and provide an alternative interpretation of adversarial training: it maximizes either a lower or an upper bound of the margins. Our experiments empirically confirm our theory and demonstrate MMA training's efficacy on the MNIST and CIFAR10 datasets w.r.t. $\ell_\infty$ and $\ell_2$ robustness.
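The abstract's central idea, choosing the perturbation budget $\epsilon$ per datapoint as that point's own margin, can be illustrated in closed form for a binary linear classifier, where the $\ell_2$ margin is computable exactly. The sketch below is only an illustration of the adaptive-$\epsilon$ principle under that simplifying assumption; the function names (`l2_margin_linear`, `mma_epsilon`) are ours, and the paper's actual method estimates margins for deep networks via the shortest successful perturbation rather than a closed-form expression.

```python
import numpy as np

def l2_margin_linear(w, b, x):
    """Closed-form l2 distance from x to the decision boundary
    w . x + b = 0 of a binary linear classifier (the "margin" in
    the input-space sense used by MMA)."""
    return abs(np.dot(w, x) + b) / np.linalg.norm(w)

def mma_epsilon(w, b, x, y):
    """Adaptive per-example epsilon in the spirit of MMA training:
    a correctly classified point (y in {-1, +1}) receives its own
    margin as the perturbation budget; a misclassified point gets
    epsilon = 0, i.e. it would be trained on the clean loss."""
    correctly_classified = y * (np.dot(w, x) + b) > 0
    if correctly_classified:
        return l2_margin_linear(w, b, x)
    return 0.0

# Example: w = (3, 4), b = 0, x = (1, 0) lies at l2 distance
# |3| / 5 = 0.6 from the boundary, so its adaptive budget is 0.6.
w, b, x = np.array([3.0, 4.0]), 0.0, np.array([1.0, 0.0])
print(mma_epsilon(w, b, x, y=+1))   # margin of a correct point
print(mma_epsilon(w, b, x, y=-1))   # 0.0 for a misclassified point
```

In contrast, standard adversarial training would apply one fixed $\epsilon$ to every point, which over-perturbs points already close to the boundary and under-perturbs points far from it.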

Cite

Text

Ding et al. "MMA Training: Direct Input Space Margin Maximization Through Adversarial Training." International Conference on Learning Representations, 2020.

Markdown

[Ding et al. "MMA Training: Direct Input Space Margin Maximization Through Adversarial Training." International Conference on Learning Representations, 2020.](https://mlanthology.org/iclr/2020/ding2020iclr-mma/)

BibTeX

@inproceedings{ding2020iclr-mma,
  title     = {{MMA Training: Direct Input Space Margin Maximization Through Adversarial Training}},
  author    = {Ding, Gavin Weiguang and Sharma, Yash and Lui, Kry Yik Chau and Huang, Ruitong},
  booktitle = {International Conference on Learning Representations},
  year      = {2020},
  url       = {https://mlanthology.org/iclr/2020/ding2020iclr-mma/}
}