Fast and Memory Efficient Differentially Private-SGD via JL Projections
Abstract
Differentially Private-SGD (DP-SGD) of Abadi et al. and its variations are the only known algorithms for private training of large-scale neural networks. This algorithm requires computation of per-sample gradient norms, which is extremely slow and memory intensive in practice. In this paper, we present a new framework to design differentially private optimizers called DP-SGD-JL and DP-Adam-JL. Our approach uses Johnson–Lindenstrauss (JL) projections to quickly approximate the per-sample gradient norms without exactly computing them, thus making the training time and memory requirements of our optimizers closer to those of their non-DP versions. Unlike previous attempts to make DP-SGD faster, which work only on a subset of network architectures or rely on compiler techniques, we propose an algorithmic solution that works for any network in a black-box manner, which is the main contribution of this paper. To illustrate this, on the IMDb dataset we train a recurrent neural network (RNN) to achieve a good privacy-vs-accuracy tradeoff, while being significantly faster than DP-SGD and with a memory footprint similar to that of non-private SGD.
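The sketch below (not the authors' code) illustrates the JL idea described in the abstract: estimate each per-sample gradient norm from a few random projections g_i · v_j, computed with forward-mode Jacobian-vector products rather than by materializing per-sample gradients. The model, data, and the number of projections `k` are illustrative assumptions.

```python
# Minimal sketch of JL-based per-sample gradient norm estimation,
# assuming a PyTorch >= 2.0 environment with torch.func available.
import torch
from torch.func import functional_call, jvp

model = torch.nn.Linear(20, 2)           # toy model; any nn.Module works
params = dict(model.named_parameters())
x, y = torch.randn(8, 20), torch.randint(0, 2, (8,))

def per_sample_losses(p):
    # Vector of losses, one entry per sample; its Jacobian rows are g_i.
    logits = functional_call(model, p, (x,))
    return torch.nn.functional.cross_entropy(logits, y, reduction="none")

k = 4                                    # number of JL projections (assumed)
sq_norm_est = torch.zeros(x.shape[0])
for _ in range(k):
    # Random Gaussian direction v with the same structure as the parameters.
    v = {name: torch.randn_like(t) for name, t in params.items()}
    # Forward-mode JVP gives all per-sample dot products g_i . v in one pass.
    _, dots = jvp(per_sample_losses, (params,), (v,))
    sq_norm_est += dots ** 2
sq_norm_est /= k                         # E[(g.v)^2] = ||g||^2 for Gaussian v
print(sq_norm_est.sqrt())                # JL estimates of per-sample grad norms
```

These estimated norms would then feed the clipping step of a DP optimizer; the paper's DP-SGD-JL and DP-Adam-JL algorithms build on this kind of norm estimation, though their exact implementation may differ from this sketch.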
Cite
Text
Bu et al. "Fast and Memory Efficient Differentially Private-SGD via JL Projections." Neural Information Processing Systems, 2021.
Markdown
[Bu et al. "Fast and Memory Efficient Differentially Private-SGD via JL Projections." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/bu2021neurips-fast/)
BibTeX
@inproceedings{bu2021neurips-fast,
  title     = {{Fast and Memory Efficient Differentially Private-SGD via JL Projections}},
  author    = {Bu, Zhiqi and Gopi, Sivakanth and Kulkarni, Janardhan and Lee, Yin Tat and Shen, Hanwen and Tantipongpipat, Uthaipon},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/bu2021neurips-fast/}
}