Consistency Regularization for Certified Robustness of Smoothed Classifiers

Abstract

A recent technique of randomized smoothing has shown that the worst-case (adversarial) l2-robustness can be transformed into the average-case Gaussian-robustness by "smoothing" a classifier, i.e., by considering the averaged prediction over Gaussian noise. In this paradigm, one should rethink the notion of adversarial robustness in terms of the generalization ability of a classifier under noisy observations. We found that the trade-off between accuracy and certified robustness of smoothed classifiers can be greatly controlled by simply regularizing the prediction consistency over noise. This relationship allows us to design a robust training objective without approximating a non-existing smoothed classifier, e.g., via soft smoothing. Our experiments under various deep neural network architectures and datasets show that "certified" l2-robustness can be dramatically improved with the proposed regularization, even achieving results better than or comparable to state-of-the-art approaches with significantly lower training cost and fewer hyperparameters.
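To make the core idea concrete, the regularizer penalizes disagreement among a classifier's predictions on several Gaussian-perturbed copies of the same input. Below is a minimal sketch, not the paper's exact objective: the function name and the specific form (mean KL divergence from the averaged prediction to each per-noise prediction) are illustrative assumptions; the paper's full loss also includes a cross-entropy term and an entropy term with tunable weights.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax along the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def consistency_loss(noisy_logits, eps=1e-12):
    """Illustrative consistency regularizer for one input x.

    noisy_logits: array of shape (m, num_classes) holding the classifier's
    logits on m noisy copies x + delta_i, with delta_i ~ N(0, sigma^2 I).
    Returns the average KL divergence KL(p_hat || p_i), where p_hat is the
    mean of the per-noise softmax predictions p_i. The loss is zero when
    all m predictions agree and grows as they diverge.
    """
    p = softmax(noisy_logits)                 # (m, C) per-noise predictions
    p_hat = p.mean(axis=0, keepdims=True)     # (1, C) averaged prediction
    kl = (p_hat * (np.log(p_hat + eps) - np.log(p + eps))).sum(axis=-1)
    return float(kl.mean())
```

In training, this term would be added (with a weight) to the usual classification loss on the noisy samples, encouraging the base classifier to behave consistently under the same Gaussian noise that the smoothed classifier averages over.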

Cite

Text

Jeong and Shin. "Consistency Regularization for Certified Robustness of Smoothed Classifiers." Neural Information Processing Systems, 2020.

Markdown

[Jeong and Shin. "Consistency Regularization for Certified Robustness of Smoothed Classifiers." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/jeong2020neurips-consistency/)

BibTeX

@inproceedings{jeong2020neurips-consistency,
  title     = {{Consistency Regularization for Certified Robustness of Smoothed Classifiers}},
  author    = {Jeong, Jongheon and Shin, Jinwoo},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/jeong2020neurips-consistency/}
}