How Private Are DP-SGD Implementations?

Abstract

We demonstrate a substantial gap between the privacy guarantees of the Adaptive Batch Linear Queries (ABLQ) mechanism under different types of batch sampling: (i) shuffling, and (ii) Poisson subsampling; the typical analysis of Differentially Private Stochastic Gradient Descent (DP-SGD) follows by interpreting it as a post-processing of ABLQ. While shuffling-based DP-SGD is more commonly used in practical implementations, it has not been amenable to easy privacy analysis, either analytically or even numerically. On the other hand, Poisson subsampling-based DP-SGD is challenging to implement scalably, but has a well-understood privacy analysis, with multiple open-source, numerically tight privacy accountants available. This has led to the common practice of deploying shuffling-based DP-SGD while reporting the privacy analysis of the corresponding Poisson subsampling version. Our result shows that there can be a substantial gap between the privacy guarantees under the two types of batch sampling, and thus advises caution in reporting privacy parameters for DP-SGD.
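The two batch samplers contrasted above can be sketched as follows. This is a minimal illustration of the sampling schemes only, not the paper's code; the function names and parameters are ours.

```python
import random

def poisson_batches(n, expected_batch_size, num_steps, rng=None):
    """Poisson subsampling: each of the n examples is included in each
    batch independently with probability q = expected_batch_size / n,
    so actual batch sizes fluctuate around the expected size."""
    rng = rng or random.Random(0)
    q = expected_batch_size / n
    return [[i for i in range(n) if rng.random() < q]
            for _ in range(num_steps)]

def shuffled_batches(n, batch_size, rng=None):
    """Shuffling: permute the n example indices once per epoch and
    split into fixed-size batches; every example appears exactly once."""
    rng = rng or random.Random(0)
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i:i + batch_size] for i in range(0, n, batch_size)]
```

The standard DP-SGD accountants assume the first sampler; the paper's point is that silently substituting the second changes the privacy guarantee.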

Cite

Text

Chua et al. "How Private Are DP-SGD Implementations?" International Conference on Machine Learning, 2024.

Markdown

[Chua et al. "How Private Are DP-SGD Implementations?" International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/chua2024icml-private/)

BibTeX

@inproceedings{chua2024icml-private,
  title     = {{How Private Are DP-SGD Implementations?}},
  author    = {Chua, Lynn and Ghazi, Badih and Kamath, Pritish and Kumar, Ravi and Manurangsi, Pasin and Sinha, Amer and Zhang, Chiyuan},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {8904--8918},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/chua2024icml-private/}
}