On the Relationship Between Data Manifolds and Adversarial Examples
Abstract
In this work we study adversarial examples in deep neural networks through the lens of a predefined data manifold. By enforcing certain geometric properties of this manifold, we are able to analyze the behavior of the learned decision boundaries. It has been shown previously that training to be robust against adversarial attacks produces models whose gradients are aligned with a small set of principal variations in the data. We demonstrate the converse of this statement: aligning model gradients with a select set of principal variations improves robustness against gradient-based adversarial attacks. Our analysis shows that this alignment also makes the data more orthogonal to the decision boundaries. We conclude that robust training methods make the problem better posed by focusing the model on the more important dimensions of variation.
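To make the central idea concrete, below is a minimal sketch (not the authors' released code) of one way to realize "aligning model gradients with a select set of principal variations": penalize the component of the input-gradient that falls outside the span of the top-k principal directions of the data. The function name `alignment_penalty` and the PCA-based choice of "principal variations" are assumptions for illustration; the paper may use a different formulation.

```python
import torch
import torch.nn.functional as F

def alignment_penalty(model, x, y, components):
    """Mean squared norm of the off-subspace part of the input-gradient.

    components: [k, d] tensor with orthonormal rows, e.g. the top-k PCA
    directions of the flattened training data.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    # create_graph=True so the penalty itself remains differentiable
    (grad,) = torch.autograd.grad(loss, x, create_graph=True)
    g = grad.view(grad.size(0), -1)          # flatten to [batch, d]
    g_on = (g @ components.T) @ components   # projection onto the principal subspace
    g_off = g - g_on                         # off-manifold gradient component
    return g_off.pow(2).sum(dim=1).mean()

# Hypothetical training objective: task loss plus the alignment penalty.
# total = F.cross_entropy(model(x), y) + lam * alignment_penalty(model, x, y, V)
```

Driving the off-manifold gradient energy toward zero is one plausible mechanism for the effect described above: model gradients concentrate in the principal subspace, leaving the data more orthogonal to the learned decision boundaries.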
Cite

Text:
Geyer et al. "On the Relationship Between Data Manifolds and Adversarial Examples." ICML 2023 Workshops: TAGML, 2023.

Markdown:
[Geyer et al. "On the Relationship Between Data Manifolds and Adversarial Examples." ICML 2023 Workshops: TAGML, 2023.](https://mlanthology.org/icmlw/2023/geyer2023icmlw-relationship/)

BibTeX:
@inproceedings{geyer2023icmlw-relationship,
title = {{On the Relationship Between Data Manifolds and Adversarial Examples}},
author = {Geyer, Michael and Bell, Brian Wesley and Fernandez, Amanda S and Moore, Juston},
booktitle = {ICML 2023 Workshops: TAGML},
year = {2023},
url = {https://mlanthology.org/icmlw/2023/geyer2023icmlw-relationship/}
}