Protecting Against Simultaneous Data Poisoning Attacks

Abstract

Current backdoor defense methods are evaluated against a single attack at a time. This is unrealistic: powerful machine learning systems are trained on large datasets scraped from the internet, which may be attacked multiple times by one or more attackers. We demonstrate that multiple backdoors can be installed simultaneously in a single model through parallel data poisoning attacks without substantially degrading clean accuracy. Furthermore, we show that existing backdoor defense methods do not effectively defend against multiple simultaneous attacks. Finally, we leverage insights into the nature of backdoor attacks to develop a new defense, BaDLoss (**Ba**ckdoor **D**etection via **Loss** Dynamics), that is effective in the multi-attack setting. With minimal clean accuracy degradation, BaDLoss attains an average attack success rate in the multi-attack setting of 7.98% on CIFAR-10, 10.29% on GTSRB, and 19.17% on Imagenette, compared with averages of 63.44%, 74.83%, and 41.74%, respectively, for the other defenses. BaDLoss also scales to ImageNet-1k, reducing the average attack success rate from 88.57% to 15.61%.
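The abstract names loss dynamics as the detection signal but does not spell out an algorithm, so the following is only a minimal sketch of one way such a detector could look, not the authors' implementation. It assumes that a per-example loss is recorded after every epoch, that a small set of trusted clean "probe" examples is available, and that examples whose loss trajectories lie far from their nearest probe trajectories are suspicious. The function names, the k-nearest-neighbour scoring rule, the rejection fraction, and the synthetic trajectories are all hypothetical choices made for illustration.

```python
import numpy as np


def knn_anomaly_scores(trajectories, probe_trajectories, k=3):
    """Score each training example by the mean Euclidean distance from its
    loss trajectory to its k nearest clean probe trajectories.

    trajectories:       array of shape (n_train, n_epochs)
    probe_trajectories: array of shape (n_probe, n_epochs)
    The scoring rule is an assumption of this sketch.
    """
    diffs = trajectories[:, None, :] - probe_trajectories[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)        # (n_train, n_probe)
    nearest = np.sort(dists, axis=1)[:, :k]      # k smallest distances per row
    return nearest.mean(axis=1)


def flag_suspicious(scores, reject_fraction):
    """Return a boolean mask marking the highest-scoring fraction of examples."""
    n_reject = int(round(reject_fraction * len(scores)))
    flagged = np.zeros(len(scores), dtype=bool)
    flagged[np.argsort(scores)[len(scores) - n_reject:]] = True
    return flagged


def demo(seed=0):
    """Toy data only: clean losses decay slowly, 'poisoned' losses decay fast.
    Real backdoors need not behave this way; the shapes are assumed."""
    rng = np.random.default_rng(seed)
    epochs = np.arange(10)

    def make(n, rate):
        base = 2.3 * np.exp(-rate * epochs)
        return base + rng.normal(0.0, 0.05, size=(n, len(epochs)))

    clean = make(200, 0.3)
    poisoned = make(20, 1.5)
    probes = make(30, 0.3)                       # trusted clean examples

    train = np.vstack([clean, poisoned])         # poisoned rows are 200..219
    scores = knn_anomaly_scores(train, probes, k=3)
    return flag_suspicious(scores, reject_fraction=20 / 220)


flagged = demo()
```

In a full pipeline under these assumptions, the flagged examples would be removed and the model retrained on the remainder. The attack success rate would then be measured separately for each backdoor trigger and averaged, which is the quantity the abstract reports.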

Cite

Text

Alex et al. "Protecting Against Simultaneous Data Poisoning Attacks." International Conference on Learning Representations, 2025.

Markdown

[Alex et al. "Protecting Against Simultaneous Data Poisoning Attacks." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/alex2025iclr-protecting/)

BibTeX

@inproceedings{alex2025iclr-protecting,
  title     = {{Protecting Against Simultaneous Data Poisoning Attacks}},
  author    = {Alex, Neel and Siddiqui, Shoaib Ahmed and Sanyal, Amartya and Krueger, David},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/alex2025iclr-protecting/}
}