Improving Adversarial Robustness by Penalizing Natural Accuracy

Chandna, Kshitij

doi:10.1007/978-3-031-25056-9_33

ECCVW 2022 pp. 517-533

Abstract

Current techniques in deep learning are still unable to train adversarially robust classifiers which perform as well as non-robust ones. In this work, we continue to study the space of loss functions, and show that the choice of loss can affect robustness in highly nonintuitive ways. Specifically, we demonstrate that a surprising choice of loss function can in fact improve adversarial robustness against some attacks. Our loss function encourages accuracy on adversarial examples, and explicitly penalizes accuracy on natural examples. This is inspired by the theoretical and empirical works suggesting a fundamental tradeoff between standard accuracy and adversarial robustness. Our method, NAturally Penalized (NAP) loss, achieves 61.5% robust accuracy on CIFAR-10 with $\varepsilon = 8/255$ perturbations in $\ell_\infty$ (against a PGD-60 adversary with 20 random restarts). This improves over the standard PGD defense by over 3%, against other loss functions proposed in the literature. Although TRADES performs better on CIFAR-10 against Auto-Attack, our approach gets better results on CIFAR-100. Our results thus suggest that significant robustness gains are possible by revisiting training techniques, even without additional data.
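The abstract does not give the NAP loss formula, so the following is only an illustrative sketch of the two ingredients it names: an $\ell_\infty$ constraint of radius 8/255 around a natural input, and a loss that rewards fitting the adversarial example while penalizing fitting the natural one. The function names, the subtractive form of `penalized_loss`, and the weight `lam` are all assumptions made for illustration and may differ from the paper's actual definition.

```python
import math

EPS = 8 / 255  # l_inf perturbation budget quoted in the abstract


def cross_entropy(logits, label):
    """Cross-entropy of one example from raw logits (stable log-sum-exp)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def project_linf(x_adv, x_nat, eps=EPS):
    """Clip x_adv into the l_inf ball of radius eps around x_nat, then into [0, 1]."""
    out = []
    for a, n in zip(x_adv, x_nat):
        a = min(max(a, n - eps), n + eps)  # stay within eps of the natural pixel
        out.append(min(max(a, 0.0), 1.0))  # stay a valid pixel intensity
    return out


def penalized_loss(logits_nat, logits_adv, label, lam=0.1):
    """Hypothetical form (an assumption, not the paper's formula):
    minimize cross-entropy on the adversarial example while subtracting a
    lam-weighted cross-entropy on the natural example."""
    return cross_entropy(logits_adv, label) - lam * cross_entropy(logits_nat, label)
```

In this sketch, `project_linf` is the projection step that a PGD-style attack would apply after each gradient step, and `penalized_loss` shows one way a natural-accuracy penalty could enter the objective. Note that a subtractive term of this kind is unbounded below, so a practical version would likely need a bounded or clipped penalty.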

Cite

Text

Chandna. "Improving Adversarial Robustness by Penalizing Natural Accuracy." European Conference on Computer Vision Workshops, 2022. doi:10.1007/978-3-031-25056-9_33

Markdown

[Chandna. "Improving Adversarial Robustness by Penalizing Natural Accuracy." European Conference on Computer Vision Workshops, 2022.](https://mlanthology.org/eccvw/2022/chandna2022eccvw-improving/) doi:10.1007/978-3-031-25056-9_33

BibTeX

@inproceedings{chandna2022eccvw-improving,
  title     = {{Improving Adversarial Robustness by Penalizing Natural Accuracy}},
  author    = {Chandna, Kshitij},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2022},
  pages     = {517-533},
  doi       = {10.1007/978-3-031-25056-9_33},
  url       = {https://mlanthology.org/eccvw/2022/chandna2022eccvw-improving/}
}