Unlearning Tabular Data Without a "Forget Set"

Abstract

Machine unlearning is the process of removing the influence of some subset of the training data from the parameters of a previously trained model. Existing methods typically require direct access to the "forget set" – the subset of training data to be forgotten by the model. This requirement impedes privacy: organizations must retain user data so that it is available for unlearning when a deletion request arrives, rather than being able to delete it immediately. We introduce RELOAD, an approximate unlearning algorithm that leverages ideas from gradient-based unlearning and neural network sparsity to achieve blind unlearning on tabular data. The method serially applies an ascent step, targeted parameter re-initialization, and fine-tuning. On empirical unlearning tasks, RELOAD often approximates the behaviour of a from-scratch retrained model better than approaches that leverage the forget set. Empirical results highlight how RELOAD has the potential to improve privacy-preserving machine learning in the tabular setting.
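
The three stages named in the abstract (an ascent step, targeted parameter re-initialization, and fine-tuning) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the forget-set signal is obtained without the data, so the sketch assumes that the summed full-data gradient was cached at the end of training and that subtracting the retained-data gradient recovers the forget-set gradient. The logistic-regression model, the choice to reset the parameters with the largest estimated forget gradient, and every name and hyperparameter (`blind_unlearn`, `cached_full_grad`, `reset_frac`, and so on) are likewise assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def grad_sum(w, X, y):
    # Summed (not averaged) log-loss gradient, so that gradients
    # over disjoint sets of rows add up to the full-data gradient.
    g = [0.0] * len(w)
    for row, label in zip(X, y):
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, row))) - label
        for j, xj in enumerate(row):
            g[j] += err * xj
    return g

def descend(w, X, y, lr, steps):
    # Plain full-batch gradient descent on the mean log-loss.
    for _ in range(steps):
        g = grad_sum(w, X, y)
        w = [wi - lr * gi / len(X) for wi, gi in zip(w, g)]
    return w

def blind_unlearn(w, cached_full_grad, X_retain, y_retain,
                  ascent_lr=0.05, reset_frac=0.25, lr=0.5, steps=200, seed=0):
    rng = random.Random(seed)
    # ASSUMPTION: forget-set gradient = cached full gradient - retained gradient.
    g_retain = grad_sum(w, X_retain, y_retain)
    g_forget = [gf - gr for gf, gr in zip(cached_full_grad, g_retain)]
    # Stage 1: one ascent step on the estimated forget-set loss.
    w = [wi + ascent_lr * gi for wi, gi in zip(w, g_forget)]
    # Stage 2: re-initialize the parameters with the largest forget gradient
    # (ASSUMPTION: a stand-in for the paper's selection rule).
    k = max(1, int(reset_frac * len(w)))
    targets = sorted(range(len(w)), key=lambda j: abs(g_forget[j]), reverse=True)[:k]
    for j in targets:
        w[j] = rng.gauss(0.0, 0.01)
    # Stage 3: fine-tune on the retained data only.
    return descend(w, X_retain, y_retain, lr, steps), g_forget

# Synthetic tabular data: a bias column plus three numeric features.
data_rng = random.Random(0)
X = [[1.0] + [data_rng.gauss(0, 1) for _ in range(3)] for _ in range(60)]
true_w = [0.5, 1.0, -2.0, 0.0]
y = [1.0 if data_rng.random() < sigmoid(sum(a * b for a, b in zip(true_w, row))) else 0.0
     for row in X]

w_full = descend([0.0] * 4, X, y, lr=0.5, steps=300)
cached_full_grad = grad_sum(w_full, X, y)   # stored once, at training time
X_forget, y_forget = X[:10], y[:10]         # kept here only to check the sketch
X_retain, y_retain = X[10:], y[10:]

w_unlearned, g_forget_est = blind_unlearn(w_full, cached_full_grad, X_retain, y_retain)
```

In this sketch the forget rows are never passed to `blind_unlearn`; they appear only so that the gradient estimate can be checked against a directly computed one. Whether the result approximates a retrained model on real data depends on details that the abstract does not specify.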

Cite

Text

Newatia et al. "Unlearning Tabular Data Without a 'Forget Set'." NeurIPS 2024 Workshops: TRL, 2024.

Markdown

[Newatia et al. "Unlearning Tabular Data Without a 'Forget Set'." NeurIPS 2024 Workshops: TRL, 2024.](https://mlanthology.org/neuripsw/2024/newatia2024neuripsw-unlearning/)

BibTeX

@inproceedings{newatia2024neuripsw-unlearning,
  title     = {{Unlearning Tabular Data Without a ``Forget Set''}},
  author    = {Newatia, Aviraj and Cooper, Michael and Krishnan, Rahul},
  booktitle = {NeurIPS 2024 Workshops: TRL},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/newatia2024neuripsw-unlearning/}
}