Adversarial Training Is a Form of Data-Dependent Operator Norm Regularization

Abstract

We establish a theoretical link between adversarial training and operator norm regularization for deep neural networks. Specifically, we prove that adversarial training based on $l_p$-norm constrained projected gradient ascent, with an $l_q$-norm loss on the logits of clean and perturbed inputs, is equivalent to data-dependent $(p, q)$ operator norm regularization. This fundamental connection confirms the long-standing argument that a network's sensitivity to adversarial examples is tied to its spectral properties, and it hints at novel ways to robustify networks and defend against adversarial attacks. We provide extensive empirical evidence on state-of-the-art network architectures to support our theoretical results.
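To make the claimed equivalence concrete in the Euclidean case $p = q = 2$, the sketch below (not the authors' released code; the toy `model`, input shape, `eps`, and iteration counts are illustrative assumptions) estimates the data-dependent $(2, 2)$ operator norm of the network Jacobian at an input $x$, i.e. its largest singular value, by power iteration, and separately runs the $l_2$-norm constrained projected gradient ascent attack that maximizes the logit difference. For small $\varepsilon$ the attack gain should approach $\varepsilon \cdot \sigma_{\max}(J_f(x))$, numerically illustrating the abstract's statement.

```python
import torch
import torch.nn as nn
from torch.autograd.functional import jvp, vjp


def jacobian_sigma_max(model, x, n_iters=20):
    """Estimate the data-dependent (2, 2) operator norm of the Jacobian
    J_f(x), i.e. sigma_max, by power iteration on J^T J. Only
    Jacobian-vector / vector-Jacobian products are used, so the Jacobian
    is never materialized. Assumes x is a single example (batch size 1)."""
    u = torch.randn_like(x)
    u = u / u.norm()
    for _ in range(n_iters):
        _, v = jvp(model, x, u)   # v <- J u        (forward directional derivative)
        _, u = vjp(model, x, v)   # u <- J^T v = (J^T J) u
        u = u / (u.norm() + 1e-12)
    _, v = jvp(model, x, u)
    return v.norm()               # ||J u|| ~ sigma_max at the top singular direction


def pgd_logit_attack(model, x, eps=1e-3, steps=10):
    """l2-norm constrained projected gradient ascent maximizing the l2
    norm of the logit difference ||f(x + delta) - f(x)||_2 -- the
    adversarial-training side of the equivalence, for p = q = 2."""
    y0 = model(x).detach()
    delta = torch.randn_like(x)
    delta = eps * delta / delta.norm()
    for _ in range(steps):
        delta = delta.detach().requires_grad_(True)
        gain = (model(x + delta) - y0).norm()
        (g,) = torch.autograd.grad(gain, delta)
        with torch.no_grad():
            delta = delta + (eps / 4) * g / (g.norm() + 1e-12)  # ascent step
            if delta.norm() > eps:                              # project onto the eps-ball
                delta = eps * delta / delta.norm()
    return delta.detach()


if __name__ == "__main__":
    torch.manual_seed(0)
    model = nn.Sequential(nn.Linear(32, 64), nn.Tanh(), nn.Linear(64, 10))
    x = torch.randn(1, 32)
    eps = 1e-3

    sigma = jacobian_sigma_max(model, x)
    delta = pgd_logit_attack(model, x, eps=eps)
    gain = (model(x + delta) - model(x)).norm()

    # In the small-eps regime the attack gain approaches eps * sigma_max.
    print(f"eps * sigma_max(J):  {eps * sigma.item():.6f}")
    print(f"max ||f(x+d)-f(x)||: {gain.item():.6f}")
```

If `jacobian_sigma_max` were used as a training-time penalty rather than a diagnostic, the `jvp`/`vjp` calls would need `create_graph=True` so gradients can flow back to the model parameters; the sketch omits this to stay minimal.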

Cite

Text

Roth et al. "Adversarial Training Is a Form of Data-Dependent Operator Norm Regularization." Neural Information Processing Systems, 2020.

Markdown

[Roth et al. "Adversarial Training Is a Form of Data-Dependent Operator Norm Regularization." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/roth2020neurips-adversarial/)

BibTeX

@inproceedings{roth2020neurips-adversarial,
  title     = {{Adversarial Training Is a Form of Data-Dependent Operator Norm Regularization}},
  author    = {Roth, Kevin and Kilcher, Yannic and Hofmann, Thomas},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/roth2020neurips-adversarial/}
}