SAIF: Sparse Adversarial and Imperceptible Attack Framework
Abstract
Adversarial attacks hamper the decision-making ability of neural networks by perturbing the input signal. For instance, adding small, carefully calculated distortions to an image can deceive a well-trained image classifier. In this work, we propose a novel attack technique called the \textbf{S}parse \textbf{A}dversarial and \textbf{I}mperceptible Attack \textbf{F}ramework (SAIF). Specifically, we design imperceptible attacks that perturb only a few pixels with low-magnitude changes, and leverage these sparse attacks to reveal the vulnerability of classifiers. We use the Frank-Wolfe (conditional gradient) algorithm to simultaneously optimize the attack perturbation for bounded magnitude and sparsity, with an $O(1/\sqrt{T})$ convergence rate. Empirical results show that SAIF computes highly imperceptible and interpretable adversarial examples, and outperforms state-of-the-art sparse attack methods on ImageNet and CIFAR-10 by a large margin.
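To make the joint optimization described above concrete, here is a minimal PyTorch sketch of a Frank-Wolfe (conditional gradient) loop over a magnitude-bounded perturbation and a sparsity mask. The function name `sparse_fw_attack`, the default budgets `eps`, `k`, and `steps`, and the continuous relaxation of the mask are illustrative assumptions for exposition, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F


def sparse_fw_attack(model, x, y, eps=8 / 255, k=1000, steps=100):
    """Sketch of a Frank-Wolfe sparse attack in the spirit of SAIF.

    p is the perturbation, constrained to the l_inf ball of radius eps;
    s is a per-pixel mask, constrained to the convex hull of k-sparse
    binary masks (a continuous relaxation; assumed here for simplicity).
    """
    p = torch.zeros_like(x, requires_grad=True)
    s = torch.full_like(x, k / x.numel())  # feasible interior start
    s.requires_grad_(True)

    for t in range(steps):
        # Untargeted attack: ascend the classification loss.
        loss = F.cross_entropy(model(x + s * p), y)
        grad_p, grad_s = torch.autograd.grad(loss, [p, s])

        # LMO over the l_inf ball: the maximizer of <grad_p, v> is a corner.
        v_p = eps * grad_p.sign()

        # LMO over the mask polytope: put mass on the k coordinates whose
        # gradient most increases the loss (keeping only positive entries).
        v_s = torch.zeros_like(s)
        idx = grad_s.flatten().topk(k).indices
        v_s.view(-1)[idx[grad_s.flatten()[idx] > 0]] = 1.0

        gamma = 2.0 / (t + 2)  # classic Frank-Wolfe step size
        with torch.no_grad():
            p += gamma * (v_p - p)
            s += gamma * (v_s - s)

    # s remains fractional in this relaxation; a final top-k rounding of s
    # would recover an exactly k-sparse perturbation.
    return (x + s * p).clamp(0, 1).detach()
```

Both constraint sets admit closed-form linear minimization oracles, so each iteration needs one gradient and no projection, which is the practical appeal of conditional gradient for this constraint structure.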
Cite
Text
Imtiaz et al. "SAIF: Sparse Adversarial and Imperceptible Attack Framework." Transactions on Machine Learning Research, 2025.
Markdown
[Imtiaz et al. "SAIF: Sparse Adversarial and Imperceptible Attack Framework." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/imtiaz2025tmlr-saif/)
BibTeX
@article{imtiaz2025tmlr-saif,
title = {{SAIF: Sparse Adversarial and Imperceptible Attack Framework}},
author = {Imtiaz, Tooba and Kohler, Morgan R and Miller, Jared F and Wang, Zifeng and Eskandar, Masih and Sznaier, Mario and Camps, Octavia and Dy, Jennifer},
journal = {Transactions on Machine Learning Research},
year = {2025},
url = {https://mlanthology.org/tmlr/2025/imtiaz2025tmlr-saif/}
}