Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning

Abstract

Deep neural networks are susceptible to adversarial examples, posing a significant security risk in critical applications. Adversarial Training (AT) is a well-established technique to enhance adversarial robustness, but it often comes at the cost of decreased generalization ability. This paper proposes Robustness Critical Fine-Tuning (RiFT), a novel approach to enhance generalization without compromising adversarial robustness. The core idea of RiFT is to exploit the redundant capacity for robustness by fine-tuning the adversarially trained model on its non-robust-critical module. To do so, we introduce module robust criticality (MRC), a measure that evaluates the significance of a given module to model robustness under worst-case weight perturbations. Using this measure, we identify the module with the lowest MRC value as the non-robust-critical module and fine-tune its weights to obtain fine-tuned weights. Subsequently, we linearly interpolate between the adversarially trained weights and fine-tuned weights to derive the optimal fine-tuned model weights. We demonstrate the efficacy of RiFT on ResNet18, ResNet34, and WideResNet34-10 models trained on CIFAR10, CIFAR100, and Tiny-ImageNet datasets. Our experiments show that RiFT can significantly improve both generalization and out-of-distribution robustness by around 1.5% while maintaining or even slightly enhancing adversarial robustness. Code is available at https://github.com/Immortalise/RiFT.
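
The module-selection and interpolation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the MRC scores are made-up numbers (computing MRC under worst-case weight perturbations is not shown), the fine-tuning step is replaced by hand-set weights, and the names `pick_non_robust_critical`, `interpolate`, and `alpha` are hypothetical. Weights are plain Python lists so the example runs without any deep learning library.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def pick_non_robust_critical(mrc: Dict[str, float]) -> str:
    """Return the module name with the lowest MRC score."""
    return min(mrc, key=mrc.get)


def interpolate(theta_at: Weights, theta_ft: Weights, alpha: float) -> Weights:
    """Per-module linear interpolation: (1 - alpha) * theta_at + alpha * theta_ft."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return {
        name: [(1.0 - alpha) * a + alpha * f
               for a, f in zip(theta_at[name], theta_ft[name])]
        for name in theta_at
    }


# Hypothetical MRC scores; in the paper these come from a robustness measure.
mrc_scores = {"layer1": 0.30, "layer2": 0.05, "layer3": 0.12}
module = pick_non_robust_critical(mrc_scores)

# Toy adversarially trained weights.
theta_at: Weights = {"layer1": [1.0, 2.0], "layer2": [0.0, 4.0], "layer3": [3.0]}

# Stand-in for fine-tuning: only the selected module's weights change.
theta_ft: Weights = dict(theta_at)
theta_ft[module] = [2.0, 0.0]

# Halfway point between the adversarially trained and fine-tuned weights.
theta_mid = interpolate(theta_at, theta_ft, alpha=0.5)
```

Because only the selected module differs between the two weight sets, interpolation leaves every other module unchanged. In practice one would sweep `alpha` over several values and keep the point that gives the best trade-off on held-out data; that selection loop is omitted here.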

Cite

Text

Zhu et al. "Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00408

Markdown

[Zhu et al. "Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/zhu2023iccv-improving/) doi:10.1109/ICCV51070.2023.00408

BibTeX

@inproceedings{zhu2023iccv-improving,
  title     = {{Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning}},
  author    = {Zhu, Kaijie and Hu, Xixu and Wang, Jindong and Xie, Xing and Yang, Ge},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {4424--4434},
  doi       = {10.1109/ICCV51070.2023.00408},
  url       = {https://mlanthology.org/iccv/2023/zhu2023iccv-improving/}
}