Vision Transformers Beat WideResNets on Small Scale Datasets Adversarial Robustness
Abstract
For an extensive period, Vision Transformers (ViTs) have been deemed unsuitable for attaining robust performance on small-scale datasets, with WideResNet models maintaining dominance in this domain. While WideResNet models have persistently set the state-of-the-art (SOTA) benchmarks for robust accuracy on datasets such as CIFAR-10 and CIFAR-100, this paper challenges the prevailing belief that only WideResNet can excel in this context. We pose the critical question of whether ViTs can surpass the robust accuracy of WideResNet models. Our results provide a resounding affirmative answer. By employing a ViT, enhanced with data generated by a diffusion model for adversarial training, we demonstrate that ViTs can indeed outperform WideResNet in terms of robust accuracy. Specifically, under the ℓ∞-norm threat model with ε = 8/255, our approach achieves robust accuracies of 74.97% on CIFAR-10 and 44.07% on CIFAR-100, representing improvements of +3.9% and +1.4%, respectively, over the previous SOTA models. Notably, our ViT-B/2 model, with 3 times fewer parameters, surpasses the previously best-performing WRN-70-16. Our achievement opens a new avenue, suggesting that future models employing ViTs or other novel efficient architectures could eventually replace the long-dominant WRN models.
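The ℓ∞-norm threat model mentioned above bounds each pixel's perturbation by ε = 8/255. A common way to generate such perturbations for adversarial training is projected gradient descent (PGD); the sketch below is a generic, framework-agnostic illustration of the ℓ∞ projection step, not the authors' exact training recipe (the attack parameters `alpha` and `steps` here are illustrative defaults).

```python
import numpy as np

def pgd_linf(x, grad_fn, eps=8/255, alpha=2/255, steps=10):
    """Generate an adversarial example inside an l_inf ball of radius eps.

    x       : clean input, values in [0, 1]
    grad_fn : callable returning the loss gradient w.r.t. the input
              (in practice computed by autodiff through the model)
    """
    x_adv = x.copy()
    for _ in range(steps):
        g = grad_fn(x_adv)
        x_adv = x_adv + alpha * np.sign(g)        # ascent step on the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into the l_inf ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep a valid pixel range
    return x_adv
```

In adversarial training, each minibatch is replaced (or augmented) with such perturbed examples before the usual gradient update, so the model learns to classify inputs anywhere inside the ε-ball.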
Cite
Text
Wu et al. "Vision Transformers Beat WideResNets on Small Scale Datasets Adversarial Robustness." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I1.32073
Markdown
[Wu et al. "Vision Transformers Beat WideResNets on Small Scale Datasets Adversarial Robustness." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/wu2025aaai-vision/) doi:10.1609/AAAI.V39I1.32073
BibTeX
@inproceedings{wu2025aaai-vision,
title = {{Vision Transformers Beat WideResNets on Small Scale Datasets Adversarial Robustness}},
author = {Wu, Juntao and Song, Ziyu and Zhang, Xiaoyu and Xie, Shujun and Lin, Longxin and Wang, Ke},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2025},
pages = {886-894},
doi = {10.1609/AAAI.V39I1.32073},
url = {https://mlanthology.org/aaai/2025/wu2025aaai-vision/}
}