Over-Parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality

Abstract

Adversarial training is a popular method for making neural nets robust to adversarial perturbations. In practice, adversarial training reliably achieves low robust training loss, yet a rigorous explanation of why this happens under natural conditions has been missing. Recently, several groups developed a convergence theory for standard (non-adversarial) supervised training of \emph{very overparameterized} nets; it is unclear how to extend these results to adversarial training because of its min-max objective. A first step in this direction was made by Gao et al. using tools from online learning, but their analysis requires the width of the net to be \emph{exponential} in the input dimension $d$ and uses an unnatural activation function. Our work proves convergence to low robust training loss with \emph{polynomial} width instead of exponential, under natural assumptions and with ReLU activations. A key element of our proof is showing that ReLU networks near initialization can approximate the step function, which may be of independent interest.
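
To make the abstract's objects concrete: adversarial training minimizes the robust loss $\min_W \frac{1}{n} \sum_{i=1}^{n} \max_{\|\delta_i\| \le \epsilon} \ell(f_W(x_i + \delta_i), y_i)$, a min-max problem. The NumPy sketch below (our own illustration, not the paper's quantitative construction) shows the simplest way ReLUs approximate a step function: a difference of two ReLUs gives a ramp of width $w$ that converges to the step function as $w \to 0$.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def soft_step(x, width):
    # Difference of two ReLUs: 0 for x <= 0, a linear ramp on (0, width),
    # and 1 for x >= width. As width -> 0 this approaches the step function.
    return (relu(x) - relu(x - width)) / width

x = np.linspace(-1.0, 1.0, 9)
for width in (0.5, 0.1, 0.01):
    print(f"width={width}: {np.round(soft_step(x, width), 3)}")
# Away from x = 0, the sup-distance to the true step shrinks with width.
```

Any two ReLU units with suitably scaled weights realize this ramp, which is why step-like behavior is in principle reachable by wide ReLU nets near initialization; the paper's contribution is a quantitative version of this fact that holds within the near-initialization regime used in the convergence analysis.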

Cite

Text

Zhang et al. "Over-Parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality." Neural Information Processing Systems, 2020.

Markdown

[Zhang et al. "Over-Parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/zhang2020neurips-overparameterized/)

BibTeX

@inproceedings{zhang2020neurips-overparameterized,
  title     = {{Over-Parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality}},
  author    = {Zhang, Yi and Plevrakis, Orestis and Du, Simon S. and Li, Xingguo and Song, Zhao and Arora, Sanjeev},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/zhang2020neurips-overparameterized/}
}