Helper-Based Adversarial Training: Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-Off

Abstract

While adversarial training has become the de facto approach for training robust classifiers, it leads to a drop in accuracy. This has led prior works to postulate that accuracy is inherently at odds with robustness. Yet, the phenomenon remains unexplained. In this paper, we closely examine the changes induced in the decision boundary of a deep network during adversarial training. We find that adversarial training leads to an unwarranted increase in the margin along certain adversarial directions, thereby hurting accuracy. Motivated by this observation, we present a novel algorithm, called Helper-based Adversarial Training (HAT), that reduces this effect by incorporating additional wrongly labelled examples during training. Our proposed method provides a notable improvement in accuracy without compromising robustness, achieving a better trade-off between accuracy and robustness than existing defenses.
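The helper-example idea described in the abstract can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: a toy linear classifier stands in for the deep network, a one-step attack stands in for the paper's inner maximization, `gamma` is a hypothetical trade-off weight, and the helper label is taken from the same model's prediction on the adversarial example (the paper obtains it from a separately trained standard network). The factor of 2 in the helper example follows the paper's margin-reduction motivation.

```python
import numpy as np

# Toy linear classifier f(x) = softmax(W x), standing in for a deep network.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))  # 3 classes, 5 input features

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(probs, label):
    return -np.log(probs[label] + 1e-12)

def one_step_attack(x, y, eps):
    # One-step L_inf perturbation: gradient of the cross-entropy w.r.t. x
    # for the linear model is W^T (softmax(Wx) - onehot(y)).
    p = softmax(W @ x)
    grad = W.T @ (p - np.eye(3)[y])
    return x + eps * np.sign(grad)

# A clean example and its adversarial counterpart.
x = rng.normal(size=5)
y = 1
x_adv = one_step_attack(x, y, eps=0.1)

# HAT-style helper example: extrapolate twice as far along the adversarial
# direction, and give it a (wrong) label -- here, the model's own prediction
# on x_adv; the paper uses a standard (non-robust) model's prediction.
x_helper = x + 2.0 * (x_adv - x)
y_helper = int(np.argmax(softmax(W @ x_adv)))

# Combined objective: adversarial loss plus a weighted helper loss that
# discourages the margin from growing past the helper example.
gamma = 0.5  # hypothetical trade-off weight
loss = cross_entropy(softmax(W @ x_adv), y) \
       + gamma * cross_entropy(softmax(W @ x_helper), y_helper)
print(round(loss, 4))
```

Minimizing the helper term pulls the decision boundary back toward the helper example, counteracting the excessive margin that the paper identifies as the source of the accuracy drop.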

Cite

Text

Rade and Moosavi-Dezfooli. "Helper-Based Adversarial Training: Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-Off." ICML 2021 Workshops: AML, 2021.

Markdown

[Rade and Moosavi-Dezfooli. "Helper-Based Adversarial Training: Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-Off." ICML 2021 Workshops: AML, 2021.](https://mlanthology.org/icmlw/2021/rade2021icmlw-helperbased/)

BibTeX

@inproceedings{rade2021icmlw-helperbased,
  title     = {{Helper-Based Adversarial Training: Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-Off}},
  author    = {Rade, Rahul and Moosavi-Dezfooli, Seyed-Mohsen},
  booktitle = {ICML 2021 Workshops: AML},
  year      = {2021},
  url       = {https://mlanthology.org/icmlw/2021/rade2021icmlw-helperbased/}
}