PureGen: Universal Data Purification for Train-Time Poison Defense via Generative Model Dynamics

Abstract

Train-time data poisoning attacks threaten machine learning models by introducing adversarial examples during training, leading to misclassification. Current defense methods often reduce generalization performance, are attack-specific, and impose significant training overhead. To address these limitations, we introduce a set of universal data purification methods using a stochastic transform, $\Psi(x)$, realized via iterative Langevin dynamics of Energy-Based Models (EBMs), Denoising Diffusion Probabilistic Models (DDPMs), or both. These approaches purify poisoned data with minimal impact on classifier generalization. Our specially trained EBMs and DDPMs provide state-of-the-art defense against various attacks (including Narcissus, Bullseye Polytope, and Gradient Matching) on CIFAR-10, Tiny-ImageNet, and CINIC-10, without needing attack- or classifier-specific information. We discuss performance trade-offs and show that our methods remain highly effective even with poisoned or distributionally shifted generative model training data.
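To make the idea of a stochastic transform $\Psi(x)$ realized by iterative Langevin dynamics more concrete, here is a minimal sketch. It is not the authors' implementation. It assumes the standard update $x \leftarrow x - \eta \nabla_x E(x) + \sqrt{2\eta}\,\varepsilon$ with $\varepsilon \sim \mathcal{N}(0, I)$, and it replaces the trained EBM with a toy quadratic energy. The names `toy_energy_grad` and `langevin_purify`, and every parameter value, are hypothetical and chosen only for illustration.

```python
import numpy as np


def toy_energy_grad(x, mean=0.5):
    """Gradient of the toy energy E(x) = 0.5 * ||x - mean||^2.

    Assumption: a stand-in for the gradient of a trained EBM, which in
    practice would be computed by backpropagation through a network.
    """
    return x - mean


def langevin_purify(x, grad_fn, n_steps=100, step_size=0.01,
                    noise_scale=1.0, seed=0):
    """Apply n_steps of Langevin dynamics to x and return the result.

    Each step moves x down the energy gradient and adds Gaussian noise.
    Setting noise_scale=0.0 reduces the update to plain gradient descent.
    """
    rng = np.random.default_rng(seed)
    # Copy so the caller's array is left untouched.
    x = np.asarray(x, dtype=np.float64).copy()
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = (x - step_size * grad_fn(x)
             + noise_scale * np.sqrt(2.0 * step_size) * noise)
    return x


if __name__ == "__main__":
    # Hypothetical "image": a 2x3 array sitting far from the low-energy region.
    poisoned = np.full((2, 3), 2.0)
    purified = langevin_purify(poisoned, toy_energy_grad)
    print(purified.shape)
```

In this sketch the number of steps controls how far a sample moves toward the low-energy region of the toy energy. With a trained model in place of the toy energy, the same knob would presumably govern the trade-off between removing perturbations and preserving image content.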

Cite

Text

Bhat et al. "PureGen: Universal Data Purification for Train-Time Poison Defense via Generative Model Dynamics." Neural Information Processing Systems, 2024. doi:10.52202/079017-4303

Markdown

[Bhat et al. "PureGen: Universal Data Purification for Train-Time Poison Defense via Generative Model Dynamics." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/bhat2024neurips-puregen/) doi:10.52202/079017-4303

BibTeX

@inproceedings{bhat2024neurips-puregen,
  title     = {{PureGen: Universal Data Purification for Train-Time Poison Defense via Generative Model Dynamics}},
  author    = {Bhat, Sunay and Jiang, Jeffrey and Pooladzandi, Omead and Branch, Alexander and Pottie, Gregory},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-4303},
  url       = {https://mlanthology.org/neurips/2024/bhat2024neurips-puregen/}
}