DP-InstaHide: Data Augmentations Provably Enhance Guarantees Against Dataset Manipulations

Abstract

Data poisoning and backdoor attacks manipulate training data to induce security breaches in a victim model. These attacks can be provably deflected using differentially private (DP) training methods, although this comes with a sharp decrease in model performance. The InstaHide method has recently been proposed as an alternative to DP training that leverages the supposed privacy properties of the mixup augmentation, although without rigorous guarantees. In this paper, we show that training with $k$-way mixup provably yields at least $k$ times stronger DP guarantees than a naive DP mechanism, and we observe that this enhanced privacy guarantee is a strong foundation for building defenses against poisoning.
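The mechanism the abstract describes, $k$-way mixup followed by additive noise, can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' reference implementation: the function name dp_instahide_batch, the Dirichlet sampling of mixing weights, the choice of Laplacian noise, and the noise_scale parameter are all assumptions made for the example.

import numpy as np

def dp_instahide_batch(images, labels, k=4, noise_scale=0.1, rng=None):
    # Hypothetical sketch of k-way mixup with additive noise.
    # images: (n, ...) array with pixel values in [0, 1]
    # labels: (n, num_classes) one-hot array
    rng = np.random.default_rng() if rng is None else rng
    n = images.shape[0]
    # Draw convex mixing weights for each output example.
    weights = rng.dirichlet(np.ones(k), size=n)    # shape (n, k)
    # Pick k source examples (with replacement) per output example.
    idx = rng.integers(0, n, size=(n, k))          # shape (n, k)
    # Mix the k images and their labels with the same weights.
    mixed_x = np.einsum('nk,nk...->n...', weights, images[idx])
    mixed_y = np.einsum('nk,nkc->nc', weights, labels[idx])
    # Add noise to the mixed images (Laplacian is an assumption here).
    # Per the abstract, k-way mixup yields at least k times stronger
    # DP guarantees than the naive (k = 1) additive-noise mechanism.
    mixed_x = mixed_x + rng.laplace(scale=noise_scale, size=mixed_x.shape)
    return np.clip(mixed_x, 0.0, 1.0), mixed_y

A batch produced this way can be fed to any standard training loop; it is the privacy analysis in the paper, not the augmentation code itself, that turns this procedure into a formal DP guarantee.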

Cite

Text

Borgnia et al. "DP-InstaHide: Data Augmentations Provably Enhance Guarantees Against Dataset Manipulations." NeurIPS 2022 Workshops: MLSW, 2022.

Markdown

[Borgnia et al. "DP-InstaHide: Data Augmentations Provably Enhance Guarantees Against Dataset Manipulations." NeurIPS 2022 Workshops: MLSW, 2022.](https://mlanthology.org/neuripsw/2022/borgnia2022neuripsw-dpinstahide/)

BibTeX

@inproceedings{borgnia2022neuripsw-dpinstahide,
  title     = {{DP-InstaHide: Data Augmentations Provably Enhance Guarantees Against Dataset Manipulations}},
  author    = {Borgnia, Eitan and Geiping, Jonas and Cherepanova, Valeriia and Fowl, Liam H. and Gupta, Arjun and Ghiasi, Amin and Huang, Furong and Goldblum, Micah and Goldstein, Tom},
  booktitle = {NeurIPS 2022 Workshops: MLSW},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/borgnia2022neuripsw-dpinstahide/}
}