Regularizer to Mitigate Gradient Masking Effect During Single-Step Adversarial Training

Abstract

Neural networks are susceptible to adversarial samples: samples with imperceptible noise, crafted to manipulate the network's prediction. To learn robust models, a training procedure called adversarial training has been introduced. During adversarial training, models are trained with mini-batches containing adversarial samples. To scale adversarial training to large datasets and networks, fast and simple methods of generating adversarial samples (e.g., FGSM: Fast Gradient Sign Method) are used during training. It has been shown that models trained using single-step adversarial training methods (i.e., with adversarial samples generated by non-iterative methods such as FGSM) are not robust; instead, they learn to generate weaker adversaries by masking the gradients. In this work, we propose a regularization term in the training loss to mitigate the effect of gradient masking during single-step adversarial training. The proposed regularization term causes the training loss to increase when the distance between the logits (i.e., the pre-softmax outputs of a classifier) for the FGSM and R-FGSM adversaries of a clean sample becomes large, where the R-FGSM adversary is obtained by adding small random noise to the clean sample before computing its FGSM sample. The proposed single-step adversarial training is faster than the computationally expensive, state-of-the-art PGD adversarial training method and achieves results on par with it.
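The regularizer described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy linear softmax classifier (so the input gradient has a closed form), a squared L2 distance between logits, a hypothetical weight `lam`, the common R-FGSM variant that takes a random sign step of size `alpha` followed by an FGSM step of size `eps - alpha`, and a total loss made of the FGSM cross-entropy plus the weighted regularizer. Pixel-range clipping is omitted. All function and parameter names are illustrative.

```python
import math
import random

def logits(W, b, x):
    """Pre-softmax outputs of a linear classifier: z = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def cross_entropy(W, b, x, y):
    return -math.log(softmax(logits(W, b, x))[y])

def input_grad(W, b, x, y):
    """Gradient of the cross-entropy w.r.t. x: W^T (softmax(z) - onehot(y))."""
    p = softmax(logits(W, b, x))
    p[y] -= 1.0
    return [sum(p[k] * W[k][j] for k in range(len(W))) for j in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(W, b, x, y, eps):
    """Single-step attack: move each input coordinate by eps along the gradient sign."""
    g = input_grad(W, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]

def rfgsm(W, b, x, y, eps, alpha, rng):
    """Assumed R-FGSM variant: random sign step of size alpha, then FGSM with eps - alpha."""
    x_noisy = [xi + alpha * sign(rng.gauss(0.0, 1.0)) for xi in x]
    g = input_grad(W, b, x_noisy, y)
    return [xi + (eps - alpha) * sign(gi) for xi, gi in zip(x_noisy, g)]

def regularized_loss(W, b, x, y, eps, alpha, lam, rng):
    """FGSM cross-entropy plus lam * squared L2 distance between FGSM and R-FGSM logits.

    The distance measure and the weighting are assumptions for illustration.
    """
    x_f = fgsm(W, b, x, y, eps)
    x_r = rfgsm(W, b, x, y, eps, alpha, rng)
    z_f, z_r = logits(W, b, x_f), logits(W, b, x_r)
    reg = sum((a - c) ** 2 for a, c in zip(z_f, z_r))
    return cross_entropy(W, b, x_f, y) + lam * reg, reg

# Toy usage: 2 classes, 3 input features (all values are made up).
W = [[1.0, -2.0, 0.5], [-0.5, 1.5, 1.0]]
b = [0.1, -0.1]
x, y = [0.2, 0.4, 0.6], 0
eps, alpha, lam = 0.1, 0.05, 1.0
total, reg = regularized_loss(W, b, x, y, eps, alpha, lam, random.Random(0))
```

In a real training loop the gradients would come from automatic differentiation through a deep network, and `total` would be averaged over a mini-batch before the parameter update. The point of the sketch is only that the penalty grows when the two single-step adversaries of the same clean sample produce diverging logits.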

Cite

Text

Vivek et al. "Regularizer to Mitigate Gradient Masking Effect During Single-Step Adversarial Training." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019. doi:10.1109/CVPRW.2019.00014

Markdown

[Vivek et al. "Regularizer to Mitigate Gradient Masking Effect During Single-Step Adversarial Training." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019.](https://mlanthology.org/cvprw/2019/s2019cvprw-regularizer/) doi:10.1109/CVPRW.2019.00014

BibTeX

@inproceedings{s2019cvprw-regularizer,
  title     = {{Regularizer to Mitigate Gradient Masking Effect During Single-Step Adversarial Training}},
  author    = {Vivek, B. S. and Baburaj, Arya and Babu, R. Venkatesh},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2019},
  pages     = {66--73},
  doi       = {10.1109/CVPRW.2019.00014},
  url       = {https://mlanthology.org/cvprw/2019/s2019cvprw-regularizer/}
}