Sparsity-Preserving Differentially Private Training of Large Embedding Models
Abstract
As the use of large embedding models in recommendation systems and language applications increases, concerns over user data privacy have also risen. DP-SGD, a training algorithm that combines differential privacy with stochastic gradient descent, has been the workhorse in protecting user privacy without significantly compromising model accuracy. However, applying DP-SGD naively to embedding models can destroy gradient sparsity, leading to reduced training efficiency. To address this issue, we present two new algorithms, DP-FEST and DP-AdaFEST, that preserve gradient sparsity during the private training of large embedding models. Our algorithms achieve substantial reductions ($10^6 \times$) in gradient size, while maintaining comparable levels of accuracy, on benchmark real-world datasets.
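The tension the abstract describes can be seen in a few lines of code: an embedding-layer gradient is row-sparse, since only rows looked up by the batch are nonzero, but naive DP-SGD adds Gaussian noise to every coordinate, making the gradient fully dense. The sketch below illustrates the problem, followed by one sparsity-preserving strategy in the spirit of the paper: privately selecting which rows to update, then noising only those. The contribution counts, threshold, and selection rule here are illustrative assumptions, not the exact DP-FEST or DP-AdaFEST procedures; see the paper for those.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 1000, 16

# An embedding gradient is row-sparse: only rows looked up by the batch
# receive a nonzero gradient.
touched_rows = np.array([3, 7, 42])
grad = np.zeros((vocab_size, dim))
grad[touched_rows] = rng.normal(size=(len(touched_rows), dim))
print("nonzero rows before noising:", np.count_nonzero(grad.any(axis=1)))  # 3

# Naive DP-SGD: clip, then add isotropic Gaussian noise to EVERY coordinate.
# (Real DP-SGD clips per-example gradients; shown on the aggregate for brevity.)
clip_norm, noise_multiplier = 1.0, 1.0
grad *= min(1.0, clip_norm / (np.linalg.norm(grad) + 1e-12))
dense_noisy = grad + rng.normal(scale=noise_multiplier * clip_norm, size=grad.shape)
print("nonzero rows, naive DP-SGD:",
      np.count_nonzero(dense_noisy.any(axis=1)))  # 1000 -- sparsity destroyed

# Sparsity-preserving variant (illustrative assumption, not the paper's exact
# algorithm): privately select rows whose noisy contribution count clears a
# threshold, then add noise only to the selected rows.
counts = np.zeros(vocab_size)
counts[touched_rows] = [50.0, 30.0, 20.0]  # examples touching each row (assumed)
noisy_counts = counts + rng.normal(scale=noise_multiplier, size=vocab_size)
selected = np.flatnonzero(noisy_counts > 10.0)  # threshold is an assumption

sparse_noisy = np.zeros_like(grad)
sparse_noisy[selected] = grad[selected] + rng.normal(
    scale=noise_multiplier * clip_norm, size=(len(selected), dim)
)
print("nonzero rows, sparsity-preserving:", len(selected))  # ~3
```

Because the noise in the second stage touches only the selected rows, the gradient stays row-sparse and the update cost scales with the number of selected rows rather than with the full vocabulary.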
Cite
Text
Ghazi et al. "Sparsity-Preserving Differentially Private Training of Large Embedding Models." Neural Information Processing Systems, 2023.
Markdown
[Ghazi et al. "Sparsity-Preserving Differentially Private Training of Large Embedding Models." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/ghazi2023neurips-sparsitypreserving/)
BibTeX
@inproceedings{ghazi2023neurips-sparsitypreserving,
  title     = {{Sparsity-Preserving Differentially Private Training of Large Embedding Models}},
  author    = {Ghazi, Badih and Huang, Yangsibo and Kamath, Pritish and Kumar, Ravi and Manurangsi, Pasin and Sinha, Amer and Zhang, Chiyuan},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/ghazi2023neurips-sparsitypreserving/}
}