Blending Adversarial Training and Representation-Conditional Purification via Aggregation Improves Adversarial Robustness

Emanuele Ballarin, Alessio Ansuini, Luca Bortolussi

TMLR 2025


Abstract

In this work, we propose CARSO, a novel adversarial defence mechanism for image classification that blends the paradigms of adversarial training and adversarial purification in a synergistic, robustness-enhancing way. The method builds upon an adversarially-trained classifier and learns to map its internal representation of a potentially perturbed input onto a distribution of tentative clean reconstructions. Multiple samples from this distribution are classified by the same adversarially-trained model, and a carefully chosen aggregation of its outputs constitutes the final robust prediction. Experimental evaluation on a well-established benchmark of strong adaptive attacks, across different image datasets, shows that CARSO is able to defend itself against adaptive end-to-end white-box attacks devised for stochastic defences. At the cost of a modest clean accuracy penalty, our method improves by a significant margin the state of the art in $\ell_\infty$ robust classification accuracy against AutoAttack on CIFAR-10, CIFAR-100, and TinyImageNet-200.
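
The inference pipeline described in the abstract (representation, sampled reconstructions, re-classification, aggregation) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: `toy_classifier`, `toy_purifier`, and `robust_predict` are hypothetical stand-ins and not the authors' code, the two-dimensional "images" replace real inputs, and averaging softmax probabilities is used only as a placeholder for the paper's "carefully chosen aggregation", which the abstract does not specify.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_classifier(x):
    """Hypothetical stand-in for the adversarially-trained classifier.

    Returns (internal representation, logits) for a 2-D input.
    """
    hidden = [max(0.0, x[0] + x[1]), max(0.0, x[0] - x[1])]
    logits = list(hidden)  # toy choice: logits equal the hidden units
    return hidden, logits


def toy_purifier(hidden, rng, noise=0.1):
    """Hypothetical stand-in for the learned purifier.

    Draws one tentative clean reconstruction conditioned on the
    classifier's internal representation (here: a noisy linear decode).
    """
    x0 = (hidden[0] + hidden[1]) / 2 + rng.gauss(0.0, noise)
    x1 = (hidden[0] - hidden[1]) / 2 + rng.gauss(0.0, noise)
    return [x0, x1]


def robust_predict(x, n_samples=8, seed=0):
    """Classify several sampled reconstructions and aggregate the outputs.

    Aggregation here is a plain mean of softmax probabilities, which is
    an assumption of this sketch rather than the paper's rule.
    """
    rng = random.Random(seed)
    hidden, _ = toy_classifier(x)  # representation of the (possibly perturbed) input
    n_classes = 2
    mean_probs = [0.0] * n_classes
    for _ in range(n_samples):
        recon = toy_purifier(hidden, rng)   # sample a reconstruction
        _, logits = toy_classifier(recon)   # classify it with the same model
        probs = softmax(logits)
        mean_probs = [m + p / n_samples for m, p in zip(mean_probs, probs)]
    label = max(range(n_classes), key=mean_probs.__getitem__)
    return label, mean_probs


label, probs = robust_predict([1.0, 0.8])
print(label, probs)
```

In this sketch the same classifier is used twice, once to produce the conditioning representation and once to label each reconstruction, which mirrors the structure the abstract describes; the number of samples and the noise scale are arbitrary illustrative values.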


Cite

Text

Ballarin et al. "Blending Adversarial Training and Representation-Conditional Purification via Aggregation Improves Adversarial Robustness." Transactions on Machine Learning Research, 2025.

Markdown

[Ballarin et al. "Blending Adversarial Training and Representation-Conditional Purification via Aggregation Improves Adversarial Robustness." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/ballarin2025tmlr-blending/)

BibTeX

@article{ballarin2025tmlr-blending,
  title     = {{Blending Adversarial Training and Representation-Conditional Purification via Aggregation Improves Adversarial Robustness}},
  author    = {Ballarin, Emanuele and Ansuini, Alessio and Bortolussi, Luca},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/ballarin2025tmlr-blending/}
}