Understanding and Improving Fast Adversarial Training Against $l_0$ Bounded Perturbations
Abstract
This work studies fast adversarial training against sparse adversarial perturbations bounded by the $l_0$ norm. We first demonstrate the unique challenges of employing $1$-step attacks on $l_0$ bounded perturbations, especially catastrophic overfitting (CO), which cannot be properly addressed by existing fast adversarial training methods for other $l_p$ norms ($p \geq 1$). We highlight that CO in $l_0$ adversarial training arises from the sub-optimal perturbation locations of the $1$-step attack. Although strategies such as multi-$\epsilon$ can mitigate this sub-optimality to some extent, they in turn lead to unstable training. Theoretical and numerical analyses further reveal that the loss landscape of $l_0$ adversarial training is more craggy than its $l_\infty$, $l_2$ and $l_1$ counterparts, which exacerbates CO. To address this issue, we adopt soft labels and a trade-off loss function to smooth the adversarial loss landscape. Extensive experiments demonstrate that our method overcomes the challenge of CO, achieves state-of-the-art performance, and narrows the performance gap between $1$-step and multi-step adversarial training against sparse attacks.
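To make the ingredients named in the abstract concrete, below is a minimal PyTorch sketch, not the authors' released implementation, of one fast adversarial training step against sparse perturbations that combines a $1$-step sparse attack with soft labels and a TRADES-style trade-off loss. The function name, the top-$k$ coordinate selection, and the hyperparameters `k`, `alpha`, `beta`, and `smooth` are illustrative assumptions rather than the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def one_step_l0_training_step(model, x, y, k=20, alpha=0.3, beta=6.0, smooth=0.1):
    """Hypothetical sketch of a 1-step sparse (l0-bounded) adversarial training step
    with label smoothing and a TRADES-style trade-off loss. All names and
    hyperparameters are illustrative, not the paper's exact algorithm."""
    model.eval()
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]

    # Select the k coordinates per image with the largest gradient magnitude,
    # a crude proxy for choosing the perturbation locations of a 1-step l0 attack.
    g = grad.abs().flatten(1)
    idx = g.topk(k, dim=1).indices
    mask = torch.zeros_like(g).scatter_(1, idx, 1.0).view_as(x)

    # Perturb only the selected coordinates and keep inputs in [0, 1].
    x_adv = (x + alpha * grad.sign() * mask).clamp(0.0, 1.0).detach()

    model.train()
    logits_clean, logits_adv = model(x), model(x_adv)

    # Soft labels (label smoothing) on the clean branch ...
    num_classes = logits_clean.size(1)
    soft = torch.full_like(logits_clean, smooth / (num_classes - 1))
    soft.scatter_(1, y.unsqueeze(1), 1.0 - smooth)
    clean_loss = -(soft * F.log_softmax(logits_clean, dim=1)).sum(dim=1).mean()

    # ... plus a KL trade-off term on the adversarial branch, as in TRADES.
    robust_loss = F.kl_div(F.log_softmax(logits_adv, dim=1),
                           F.softmax(logits_clean, dim=1), reduction="batchmean")
    return clean_loss + beta * robust_loss
```

The intent of the sketch is only to show how the two smoothing ingredients (soft labels and the trade-off term) plug into a single-step sparse attack; the paper's actual attack for choosing perturbation locations and its exact loss should be taken from the full text.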
Cite
Text
Zhong et al. "Understanding and Improving Fast Adversarial Training Against $l_0$ Bounded Perturbations." Advances in Neural Information Processing Systems, 2025.
Markdown
[Zhong et al. "Understanding and Improving Fast Adversarial Training Against $l_0$ Bounded Perturbations." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/zhong2025neurips-understanding/)
BibTeX
@inproceedings{zhong2025neurips-understanding,
  title     = {{Understanding and Improving Fast Adversarial Training Against $l_0$ Bounded Perturbations}},
  author    = {Zhong, Xuyang and Huang, Yixiao and Liu, Chen},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/zhong2025neurips-understanding/}
}