Reconstruction Attacks on Machine Unlearning: Simple Models Are Vulnerable

Abstract

Machine unlearning is motivated by principles of data autonomy. The premise is that a person can request to have their data's influence removed from deployed models, and those models should be updated as if they were retrained without the person's data. We show that these updates expose individuals to high-accuracy reconstruction attacks that allow an attacker to recover their data in its entirety, even when the original models are so simple that privacy risk might not otherwise have been a concern. We show how to mount a near-perfect attack on the deleted data point from linear regression models. We then generalize our attack to other loss functions and architectures, and empirically demonstrate the effectiveness of our attacks across a wide range of datasets (capturing both tabular and image data). Our work highlights that privacy risk is significant even for extremely simple model classes when individuals can request deletion of their data from the model.
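
To make the linear-regression claim concrete, here is a minimal sketch of the core idea for ordinary least squares, under simplifying assumptions rather than the paper's full attack: an attacker who observes the model parameters before and after an exact deletion, and who knows (or can estimate) the gram matrix X^T X of the original training set, can solve for the deleted example in closed form. Subtracting the normal equations of the two models gives A(theta - theta') = x0 * (y0 - x0^T theta'), so the deleted feature vector is recovered up to a scalar, and a known intercept coordinate fixes the scale. All names in the snippet are illustrative.

import numpy as np

rng = np.random.default_rng(0)

# Simulate a training set; the last feature is a constant intercept
# column, which is what lets the attack resolve the unknown scale.
n, d = 200, 5
X = np.hstack([rng.normal(size=(n, d)), np.ones((n, 1))])
y = X @ rng.normal(size=d + 1) + 0.1 * rng.normal(size=n)

def ols(X, y):
    # Ordinary least squares: solve (X^T X) theta = X^T y.
    return np.linalg.solve(X.T @ X, X.T @ y)

theta = ols(X, y)              # model released before deletion
x0, y0 = X[0], y[0]            # the example whose deletion is requested
theta_p = ols(X[1:], y[1:])    # model after exact unlearning (retraining)

# Attack. With A = X^T X, subtracting the two normal equations gives
#   A (theta - theta_p) = x0 * (y0 - x0^T theta_p),
# i.e. a vector parallel to x0. A is assumed known exactly here for
# simplicity; a realistic attacker would have to estimate it.
A = X.T @ X
v = A @ (theta - theta_p)      # v = c * x0, with c = y0 - x0 @ theta_p
c = v[-1]                      # intercept coordinate of x0 is 1, so v[-1] = c
x0_hat = v / c                 # fails only if c == 0, i.e. the retrained
y0_hat = c + x0_hat @ theta_p  # model fits the deleted point exactly

print(np.allclose(x0_hat, x0), np.allclose(y0_hat, y0))  # True True

The scale step is where the sketch leans on its assumptions: with no intercept (or no other known coordinate of x0), the attacker recovers the deleted point only up to the one-parameter family v / c, which is why the paper's generalization to other losses and architectures needs more machinery than this closed-form identity.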

Cite

Text

Bertran et al. "Reconstruction Attacks on Machine Unlearning: Simple Models Are Vulnerable." Neural Information Processing Systems, 2024. doi:10.52202/079017-3334

Markdown

[Bertran et al. "Reconstruction Attacks on Machine Unlearning: Simple Models Are Vulnerable." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/bertran2024neurips-reconstruction/) doi:10.52202/079017-3334

BibTeX

@inproceedings{bertran2024neurips-reconstruction,
  title     = {{Reconstruction Attacks on Machine Unlearning: Simple Models Are Vulnerable}},
  author    = {Bertran, Martin and Tang, Shuai and Kearns, Michael and Morgenstern, Jamie and Roth, Aaron and Wu, Zhiwei Steven},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-3334},
  url       = {https://mlanthology.org/neurips/2024/bertran2024neurips-reconstruction/}
}