Consistency Regularization for Certified Robustness of Smoothed Classifiers

Abstract

Randomized smoothing, a recent technique, has shown that worst-case (adversarial) l2-robustness can be transformed into average-case Gaussian robustness by "smoothing" a classifier, i.e., by considering its averaged prediction over Gaussian noise. In this paradigm, one should rethink the notion of adversarial robustness in terms of the generalization ability of a classifier under noisy observations. We found that the trade-off between accuracy and certified robustness of smoothed classifiers can be largely controlled by simply regularizing the prediction consistency over noise. This relationship allows us to design a robust training objective without approximating a non-existing smoothed classifier, e.g., via soft smoothing. Our experiments across various deep neural network architectures and datasets show that the "certified" l2-robustness can be dramatically improved with the proposed regularization, even achieving results better than or comparable to state-of-the-art approaches with significantly lower training cost and fewer hyperparameters.
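
The two ideas in the abstract, averaging predictions over Gaussian noise and penalizing disagreement among those noisy predictions, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the linear `toy_classifier`, the weights `W`, the noise level `sigma`, the sample count `m`, and the specific penalty (the mean KL divergence from the averaged prediction to each noisy prediction), which is only one plausible way to measure prediction consistency.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def toy_classifier(x, W):
    # Hypothetical base classifier: linear logits followed by softmax.
    return softmax(x @ W)

def noisy_predictions(x, W, sigma, m, rng):
    # Class probabilities for m Gaussian-perturbed copies of x.
    noise = rng.normal(0.0, sigma, size=(m,) + x.shape)
    return toy_classifier(x + noise, W)  # shape (m, num_classes)

def smoothed_prediction(probs):
    # Monte Carlo estimate of the averaged prediction over noise.
    return probs.mean(axis=0)

def consistency_penalty(probs, eps=1e-12):
    # Assumed penalty: mean KL(average prediction || each noisy prediction).
    mean = probs.mean(axis=0)
    kl = (mean * (np.log(mean + eps) - np.log(probs + eps))).sum(axis=-1)
    return kl.mean()

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))   # 4 input features, 3 classes (illustrative)
x = rng.normal(size=4)
probs = noisy_predictions(x, W, sigma=0.25, m=8, rng=rng)
avg = smoothed_prediction(probs)
penalty = consistency_penalty(probs)
```

In a training loop, a term like `penalty` would be added to the usual classification loss with a weighting coefficient. The penalty is zero when all noisy predictions agree and grows as they diverge, which is the sense in which it regularizes consistency over noise.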

Cite

Text

Jeong and Shin. "Consistency Regularization for Certified Robustness of Smoothed Classifiers." Neural Information Processing Systems, 2020.

Markdown

[Jeong and Shin. "Consistency Regularization for Certified Robustness of Smoothed Classifiers." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/jeong2020neurips-consistency/)

BibTeX

@inproceedings{jeong2020neurips-consistency,
  title     = {{Consistency Regularization for Certified Robustness of Smoothed Classifiers}},
  author    = {Jeong, Jongheon and Shin, Jinwoo},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/jeong2020neurips-consistency/}
}