Demystifying Poisoning Backdoor Attacks from a Statistical Perspective
Abstract
Backdoor attacks pose a significant security risk to machine learning applications due to their stealthy nature and potentially serious consequences. Such attacks involve embedding triggers within a learning model with the intention of causing malicious behavior when an active trigger is present while maintaining regular functionality without it. This paper derives a fundamental understanding of backdoor attacks that applies to both discriminative and generative models, including diffusion models and large language models. We evaluate the effectiveness of any backdoor attack incorporating a constant trigger, by establishing tight lower and upper boundaries for the performance of the compromised model on both clean and backdoor test data. The developed theory answers a series of fundamental but previously underexplored problems, including (1) what are the determining factors for a backdoor attack's success, (2) what is the direction of the most effective backdoor attack, and (3) when will a human-imperceptible trigger succeed. We demonstrate the theory by conducting experiments using benchmark datasets and state-of-the-art backdoor attack scenarios. Our code is available at https://github.com/KeyWgh/DemystifyBackdoor.
Cite
Text
Wang et al. "Demystifying Poisoning Backdoor Attacks from a Statistical Perspective." International Conference on Learning Representations, 2024.
Markdown
[Wang et al. "Demystifying Poisoning Backdoor Attacks from a Statistical Perspective." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/wang2024iclr-demystifying/)
BibTeX
@inproceedings{wang2024iclr-demystifying,
title = {{Demystifying Poisoning Backdoor Attacks from a Statistical Perspective}},
author = {Wang, Ganghua and Xian, Xun and Kundu, Ashish and Srinivasa, Jayanth and Bi, Xuan and Hong, Mingyi and Ding, Jie},
booktitle = {International Conference on Learning Representations},
year = {2024},
url = {https://mlanthology.org/iclr/2024/wang2024iclr-demystifying/}
}