Distribution-Aware Mean Estimation Under User-Level Local Differential Privacy
Abstract
We consider the problem of mean estimation under user-level local differential privacy, where $n$ users each contribute through their local pool of data samples. Previous work assumes that the number of data samples is the same across users. In contrast, we consider a more general and realistic scenario where each user $u \in [n]$ owns $m_u$ data samples drawn from some generative distribution $\mu$, with $m_u$ unknown to the statistician but drawn from a known distribution $M$ over $\mathbb{N}$. Based on a distribution-aware mean estimation algorithm, we establish an $M$-dependent upper bound on the worst-case risk over $\mu$ for the task of mean estimation. We then derive a lower bound. The two bounds are asymptotically matching up to logarithmic factors and reduce to known bounds when $m_u = m$ for every user $u$.
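To make the setting concrete, here is a minimal simulation sketch of a naive baseline. It is not the paper's distribution-aware algorithm. It assumes samples bounded in $[-1, 1]$, a Laplace mechanism applied to each user's local mean, and a hypothetical sampler `sample_m` standing in for the distribution $M$. All function names are illustrative.

```python
import random


def sample_laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def local_report(samples, epsilon, rng):
    # Assumption: every sample lies in [-1, 1], so the local mean does too.
    # Replacing a user's whole dataset moves the local mean by at most 2,
    # hence Laplace noise of scale 2 / epsilon gives epsilon-LDP per user.
    local_mean = sum(samples) / len(samples)
    local_mean = max(-1.0, min(1.0, local_mean))  # defensive clipping
    return local_mean + sample_laplace(2.0 / epsilon, rng)


def estimate_mean(n, epsilon, true_mean, sample_m, rng):
    # Naive baseline: average the n noisy reports with equal weights,
    # ignoring how many samples m_u each user actually holds.
    reports = []
    for _ in range(n):
        m_u = sample_m(rng)  # m_u ~ M, unknown to the statistician
        # Illustrative choice of mu: values in {-1, +1} with mean true_mean.
        p_plus = (1.0 + true_mean) / 2.0
        samples = [1.0 if rng.random() < p_plus else -1.0 for _ in range(m_u)]
        reports.append(local_report(samples, epsilon, rng))
    return sum(reports) / n


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical M: uniform over {1, ..., 50}.
    estimate = estimate_mean(20000, 1.0, 0.3, lambda r: r.randint(1, 50), rng)
    print(f"estimated mean: {estimate:.3f}")
```

Because this baseline adds the same noise whether a user holds one sample or fifty, it does not use any knowledge of $M$; the abstract's $M$-dependent bounds concern estimators that do.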
Cite
Text
Pla et al. "Distribution-Aware Mean Estimation Under User-Level Local Differential Privacy." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.
Markdown
[Pla et al. "Distribution-Aware Mean Estimation Under User-Level Local Differential Privacy." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.](https://mlanthology.org/aistats/2025/pla2025aistats-distributionaware/)
BibTeX
@inproceedings{pla2025aistats-distributionaware,
title = {{Distribution-Aware Mean Estimation Under User-Level Local Differential Privacy}},
author = {Pla, Corentin and Vono, Maxime and Richard, Hugo},
booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
year = {2025},
pages = {2089-2097},
volume = {258},
url = {https://mlanthology.org/aistats/2025/pla2025aistats-distributionaware/}
}