Standard-Deviation-Inspired Regularization for Improving Adversarial Robustness
Abstract
Adversarial Training (AT) has been demonstrated to improve the robustness of deep neural networks (DNNs) to adversarial attacks. AT is a min-max optimization procedure in which adversarial examples are generated to train a robust DNN. The inner maximization step of AT maximizes the losses of inputs w.r.t. their actual classes. The outer minimization step minimizes the losses on the adversarial examples obtained from the inner maximization. This work proposes a standard-deviation-inspired (SDI) regularization term for improving adversarial robustness and generalization. We argue that the inner maximization is akin to minimizing a modified standard deviation of a model's output probabilities, and that maximizing this modified standard deviation measure may complement the outer minimization of the AT framework. To corroborate our argument, we experimentally show that the SDI measure may be utilized to craft adversarial examples. Furthermore, we show that combining the proposed SDI regularization term with existing AT variants improves the robustness of DNNs to stronger attacks (e.g., CW and AutoAttack) and improves robust generalization.
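The min-max structure described in the abstract can be illustrated with a minimal NumPy sketch. Note the hedges: the paper's exact "modified standard deviation" formula is not given in the abstract, so the `sdi` term below is an illustrative stand-in (the plain per-example standard deviation of the softmax probabilities), and the inner maximization is approximated with a single FGSM step on a toy linear softmax classifier; the real method uses multi-step attacks on deep networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear softmax classifier: logits = x @ W + b (illustrative stand-in for a DNN).
W = rng.normal(size=(4, 3)) * 0.1
b = np.zeros(3)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(p, y):
    # mean negative log-likelihood of the true classes
    return -np.log(p[np.arange(len(y)), y] + 1e-12).mean()

def fgsm(x, y, eps):
    """One-step inner maximization (FGSM): perturb inputs in the
    sign of the input gradient of the cross-entropy loss.
    For a linear-softmax model, d(CE)/dx = (p - onehot) @ W.T / n."""
    p = softmax(x @ W + b)
    onehot = np.eye(3)[y]
    grad_x = (p - onehot) @ W.T / len(y)
    return x + eps * np.sign(grad_x)

# A small batch of inputs and labels.
x = rng.normal(size=(8, 4))
y = rng.integers(0, 3, size=8)

# Inner maximization: craft adversarial examples.
x_adv = fgsm(x, y, eps=0.1)
p_clean = softmax(x @ W + b)
p_adv = softmax(x_adv @ W + b)
ce_clean = cross_entropy(p_clean, y)
ce_adv = cross_entropy(p_adv, y)

# Illustrative SDI-style term (assumption, not the paper's exact measure):
# the standard deviation of each example's output probability vector.
sdi = p_adv.std(axis=-1).mean()

# Outer minimization objective: adversarial loss minus a weighted SDI term,
# so that gradient descent maximizes the standard deviation measure.
lam = 0.5  # hypothetical trade-off weight
total_loss = ce_adv - lam * sdi
```

Because the toy model's loss is convex in its input, the FGSM step is guaranteed not to decrease the cross-entropy here, which mirrors the abstract's point that the inner maximization produces higher-loss training examples.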
Cite
Text

Fakorede et al. "Standard-Deviation-Inspired Regularization for Improving Adversarial Robustness." Transactions on Machine Learning Research, 2024.

Markdown

[Fakorede et al. "Standard-Deviation-Inspired Regularization for Improving Adversarial Robustness." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/fakorede2024tmlr-standarddeviationinspired/)

BibTeX
@article{fakorede2024tmlr-standarddeviationinspired,
title = {{Standard-Deviation-Inspired Regularization for Improving Adversarial Robustness}},
author = {Fakorede, Olukorede and Atsague, Modeste and Tian, Jin},
journal = {Transactions on Machine Learning Research},
year = {2024},
url = {https://mlanthology.org/tmlr/2024/fakorede2024tmlr-standarddeviationinspired/}
}