Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks
Abstract
We propose new, more efficient targeted white-box attacks against deep neural networks. Our attacks better align with the attacker’s goal: (1) tricking a model into assigning higher probability to the target class than to any other class, while (2) staying within an $\epsilon$-distance of the attacked input. First, we demonstrate a loss function that explicitly encodes (1) and show that Auto-PGD finds more attacks with it. Second, we propose a new attack method, Constrained Gradient Descent (CGD), using a refinement of our loss function that captures both (1) and (2). CGD seeks to satisfy both attacker objectives (misclassification and bounded $\ell_{p}$-norm) in a principled manner, as part of the optimization, instead of via ad hoc post-processing techniques (e.g., projection or clipping). We show that CGD is more successful on CIFAR10 (by 0.9–4.2%) and ImageNet (by 8.6–13.6%) than state-of-the-art attacks while consuming less time (by 11.4–18.8%). Statistical tests confirm that our attack outperforms others against leading defenses on different datasets and values of $\epsilon$.
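To make the two-objective formulation concrete, below is a minimal, hypothetical sketch (in PyTorch) of how a targeted margin loss for objective (1) and a penalty on the amount by which the perturbation leaves the $\epsilon$-ball for objective (2) could be folded into a single gradient-descent loop, with no projection or clipping step afterwards. The helper names, the $\ell_{\infty}$ choice, the plain sum of the two terms, and the signed-gradient update are all assumptions made for this illustration, not the paper's exact algorithm.

# Illustrative sketch only; not the authors' exact CGD procedure.
# Assumes a PyTorch classifier returning logits for image batches of
# shape [B, C, H, W]; eps, step_size, and n_steps are placeholder values.
import torch
import torch.nn.functional as F

def targeted_margin_loss(logits, target):
    # Objective (1): the target-class logit should exceed every other logit.
    # The loss stays positive until the target class has the highest logit.
    target_logit = logits.gather(1, target.unsqueeze(1)).squeeze(1)
    mask = F.one_hot(target, num_classes=logits.size(1)).bool()
    other_max = logits.masked_fill(mask, float('-inf')).max(dim=1).values
    return other_max - target_logit

def linf_violation(delta, eps):
    # Objective (2): penalize only the portion of the perturbation that
    # falls outside the eps-ball (an l_inf ball in this sketch).
    return torch.clamp(delta.abs() - eps, min=0.0).flatten(1).sum(dim=1)

def cgd_like_attack(model, x, target, eps=8/255, step_size=1/255, n_steps=100):
    # Both objectives enter the loss directly, so no projection or clipping
    # is applied after each update.
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(n_steps):
        logits = model(x + delta)
        loss = (targeted_margin_loss(logits, target)
                + linf_violation(delta, eps)).sum()
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta -= step_size * grad.sign()
    return (x + delta).detach()

The plain sum of the two terms above is only a naive way of combining the objectives; the paper's refined loss handles their interplay as part of the optimization itself.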
Cite
Text
Lin et al. "Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks." International Conference on Machine Learning, 2022.
Markdown
[Lin et al. "Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks." International Conference on Machine Learning, 2022.](https://mlanthology.org/icml/2022/lin2022icml-constrained/)
BibTeX
@inproceedings{lin2022icml-constrained,
title = {{Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks}},
author = {Lin, Weiran and Lucas, Keane and Bauer, Lujo and Reiter, Michael K. and Sharif, Mahmood},
booktitle = {International Conference on Machine Learning},
year = {2022},
pages = {13405-13430},
volume = {162},
url = {https://mlanthology.org/icml/2022/lin2022icml-constrained/}
}