Sample-Efficient Private Learning of Mixtures of Gaussians
Abstract
We study the problem of learning mixtures of Gaussians with approximate differential privacy. We prove that roughly $kd^2 + k^{1.5} d^{1.75} + k^2 d$ samples suffice to learn a mixture of $k$ arbitrary $d$-dimensional Gaussians up to low total variation distance, with differential privacy. Our work improves over the previous best result (which required roughly $k^2 d^4$ samples) and is provably optimal when $d$ is much larger than $k^2$. Moreover, we give the first optimal bound for privately learning mixtures of $k$ univariate (i.e., $1$-dimensional) Gaussians. Importantly, we show that the sample complexity for learning mixtures of univariate Gaussians is linear in the number of components $k$, whereas the previous best sample complexity was quadratic in $k$. Our algorithms utilize various techniques, including the inverse sensitivity mechanism, sample compression for distributions, and methods for bounding volumes of sumsets.
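As a displayed summary of the abstract's main quantitative claim (the $\widetilde{O}(\cdot)$ notation and the suppressed dependence on the accuracy and privacy parameters are a reading of the word "roughly" above, not an additional claim from the paper body), the upper bound on the number of samples $n$ can be written as

$$n = \widetilde{O}\!\left(k d^{2} + k^{1.5} d^{1.75} + k^{2} d\right),$$

which is dominated by the $k d^{2}$ term, and hence matches known lower bounds, once $d$ grows much larger than $k^{2}$.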
Cite
Text
Ashtiani et al. "Sample-Efficient Private Learning of Mixtures of Gaussians." Neural Information Processing Systems, 2024. doi:10.52202/079017-3141
Markdown
[Ashtiani et al. "Sample-Efficient Private Learning of Mixtures of Gaussians." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/ashtiani2024neurips-sampleefficient/) doi:10.52202/079017-3141
BibTeX
@inproceedings{ashtiani2024neurips-sampleefficient,
title = {{Sample-Efficient Private Learning of Mixtures of Gaussians}},
author = {Ashtiani, Hassan and Majid, Mahbod and Narayanan, Shyam},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-3141},
url = {https://mlanthology.org/neurips/2024/ashtiani2024neurips-sampleefficient/}
}