Adversarial Examples Exist in Two-Layer ReLU Networks for Low Dimensional Linear Subspaces

Abstract

Despite a great deal of research, it is still not well understood why trained neural networks are highly vulnerable to adversarial examples. In this work we focus on two-layer neural networks trained using data which lie on a low dimensional linear subspace. We show that standard gradient methods lead to non-robust neural networks, namely, networks which have large gradients in directions orthogonal to the data subspace, and are susceptible to small adversarial $L_2$-perturbations in these directions. Moreover, we show that decreasing the initialization scale of the training algorithm, or adding $L_2$ regularization, can make the trained network more robust to adversarial perturbations orthogonal to the data.
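
To make the setting concrete, the sketch below trains a small two-layer ReLU network on synthetic data confined to a low-dimensional linear subspace and then splits the input gradient at a training point into its components inside and orthogonal to that subspace, followed by a small $L_2$ step in the orthogonal direction. This is an illustration of the phenomenon described in the abstract, not the paper's experimental setup; the dimensions, initialization scale, loss, optimizer, and perturbation size are illustrative assumptions.

# A minimal sketch (assumed setup, not the paper's experiments): train a two-layer
# ReLU network on data lying on a low-dimensional linear subspace of R^d, then
# compare the input-gradient norm inside vs. orthogonal to the subspace.
import torch

torch.manual_seed(0)
d, k, n, width = 100, 5, 200, 512     # ambient dim, subspace dim, samples, hidden width

# Orthonormal basis of a random k-dimensional subspace; the data lives on it
basis, _ = torch.linalg.qr(torch.randn(d, k))
x = torch.randn(n, k) @ basis.T
y = torch.sign(torch.randn(n))        # arbitrary binary labels in {-1, +1}

# Two-layer ReLU network x -> v^T ReLU(W x); 0.1 is an illustrative init scale
W = (0.1 * torch.randn(width, d)).requires_grad_()
v = (0.1 * torch.randn(width)).requires_grad_()
net = lambda z: torch.relu(z @ W.T) @ v

opt = torch.optim.SGD([W, v], lr=0.05)
for _ in range(500):                  # plain gradient training on the logistic loss
    opt.zero_grad()
    torch.nn.functional.softplus(-y * net(x)).mean().backward()
    opt.step()

# Split the input gradient at a training point into subspace / orthogonal parts
x0 = x[0].clone().requires_grad_()
net(x0.unsqueeze(0)).sum().backward()
g = x0.grad
g_par = basis @ (basis.T @ g)         # projection onto the data subspace
g_perp = g - g_par                    # component orthogonal to the data
print(f"grad norm in subspace: {g_par.norm():.3f}, orthogonal: {g_perp.norm():.3f}")

# A small L2 step along the orthogonal gradient direction, i.e. a perturbation
# that leaves the data subspace; whether it flips the output depends on the run
eps = 0.5
x_adv = x[0] - eps * torch.sign(net(x[:1])) * g_perp / g_perp.norm()
print("clean output:", net(x[:1]).item(), "perturbed output:", net(x_adv.unsqueeze(0)).item())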

Cite

Text

Melamed et al. "Adversarial Examples Exist in Two-Layer ReLU Networks for Low Dimensional Linear Subspaces." Neural Information Processing Systems, 2023.

Markdown

[Melamed et al. "Adversarial Examples Exist in Two-Layer ReLU Networks for Low Dimensional Linear Subspaces." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/melamed2023neurips-adversarial/)

BibTeX

@inproceedings{melamed2023neurips-adversarial,
  title     = {{Adversarial Examples Exist in Two-Layer ReLU Networks for Low Dimensional Linear Subspaces}},
  author    = {Melamed, Odelia and Yehudai, Gilad and Vardi, Gal},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/melamed2023neurips-adversarial/}
}