Blending Adversarial Training and Representation-Conditional Purification via Aggregation Improves Adversarial Robustness
Abstract
In this work, we propose CARSO, a novel adversarial defence mechanism for image classification that blends the paradigms of adversarial training and adversarial purification in a synergistic, robustness-enhancing way. The method builds upon an adversarially-trained classifier and learns to map its internal representation of a potentially perturbed input onto a distribution of tentative clean reconstructions. Multiple samples from this distribution are classified by the same adversarially-trained model, and a carefully chosen aggregation of its outputs constitutes the final robust prediction. Experimental evaluation on a well-established benchmark of strong adaptive attacks, across different image datasets, shows that CARSO is able to defend itself against adaptive end-to-end white-box attacks devised for stochastic defences. At the cost of a modest clean accuracy penalty, our method improves by a significant margin on the state of the art for CIFAR-10, CIFAR-100, and TinyImageNet-200 $\ell_\infty$ robust classification accuracy against AutoAttack.
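The inference pipeline described in the abstract (extract the classifier's internal representation, sample several tentative reconstructions conditioned on it, classify each one with the same model, aggregate the outputs) can be illustrated with a minimal sketch. Everything named here is hypothetical: `robust_predict`, `toy_classify`, and `toy_sample` are stand-ins invented for illustration, and the aggregation shown (a plain mean of softmax probabilities) is an assumption, since the abstract states only that the aggregation is "carefully chosen" and does not specify it.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def robust_predict(
    x: Sequence[float],
    classify: Callable[[Sequence[float]], Tuple[List[float], List[float]]],
    sample_reconstruction: Callable[[Sequence[float]], List[float]],
    n_samples: int = 8,
) -> Tuple[int, List[float]]:
    """Classify x via representation-conditional purification and aggregation.

    classify(x) returns (logits, internal_representation).
    sample_reconstruction(representation) draws one tentative clean input.
    ASSUMPTION: outputs are aggregated by averaging softmax probabilities.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    # Step 1: internal representation of the (possibly perturbed) input.
    _, representation = classify(x)
    prob_sum = None
    for _ in range(n_samples):
        # Step 2: sample a tentative clean reconstruction.
        x_hat = sample_reconstruction(representation)
        # Step 3: classify the reconstruction with the same classifier.
        logits, _ = classify(x_hat)
        probs = softmax(logits)
        prob_sum = probs if prob_sum is None else [a + b for a, b in zip(prob_sum, probs)]
    # Step 4: aggregate (here: mean probability) and pick the top class.
    mean_probs = [p / n_samples for p in prob_sum]
    label = max(range(len(mean_probs)), key=mean_probs.__getitem__)
    return label, mean_probs


# Toy stand-ins (hypothetical): the "classifier" uses the input as logits and
# as its representation; the "purifier" perturbs it with small Gaussian noise.
def toy_classify(x: Sequence[float]) -> Tuple[List[float], List[float]]:
    return list(x), list(x)


def toy_sample(representation: Sequence[float], rng: random.Random) -> List[float]:
    return [r + rng.gauss(0.0, 0.1) for r in representation]


if __name__ == "__main__":
    rng = random.Random(0)
    label, probs = robust_predict(
        [2.0, 0.0], toy_classify, lambda r: toy_sample(r, rng), n_samples=16
    )
    print(label, probs)
```

In a real setting the two callables would wrap the adversarially-trained network and the learned generative purifier; the sketch only fixes the order of operations, not the models or the aggregation rule.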
Cite
Text
Ballarin et al. "Blending Adversarial Training and Representation-Conditional Purification via Aggregation Improves Adversarial Robustness." Transactions on Machine Learning Research, 2025.
Markdown
[Ballarin et al. "Blending Adversarial Training and Representation-Conditional Purification via Aggregation Improves Adversarial Robustness." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/ballarin2025tmlr-blending/)
BibTeX
@article{ballarin2025tmlr-blending,
title = {{Blending Adversarial Training and Representation-Conditional Purification via Aggregation Improves Adversarial Robustness}},
author = {Ballarin, Emanuele and Ansuini, Alessio and Bortolussi, Luca},
journal = {Transactions on Machine Learning Research},
year = {2025},
url = {https://mlanthology.org/tmlr/2025/ballarin2025tmlr-blending/}
}