Distribution-Aware Mean Estimation Under User-Level Local Differential Privacy

Abstract

We consider the problem of mean estimation under user-level local differential privacy, where each of $n$ users contributes through a local pool of data samples. Previous work assumes that the number of data samples is the same across users. In contrast, we consider a more general and realistic scenario where each user $u \in [n]$ owns $m_u$ data samples drawn from some generative distribution $\mu$, with $m_u$ unknown to the statistician but drawn from a known distribution $M$ over $\mathbb{N}$. Based on a distribution-aware mean estimation algorithm, we establish an $M$-dependent upper bound on the worst-case risk over $\mu$ for the task of mean estimation. We then derive a lower bound. The two bounds match asymptotically up to logarithmic factors and reduce to known bounds when $m_u = m$ for every user $u$.
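
To make the setting concrete, the following is a minimal simulation sketch of a naive baseline, not the distribution-aware algorithm of the paper. It assumes samples bounded in $[-1, 1]$, a Laplace mechanism applied to each user's local mean, and a hypothetical choice of $M$ (uniform on $\{1, \dots, 20\}$); all function names and parameter values are illustrative assumptions. The baseline weights every user equally regardless of $m_u$, which is the kind of heterogeneity the abstract is concerned with.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_user_mean(samples, epsilon, rng):
    # Assumption: every sample lies in [-1, 1], so the local mean does too.
    # Replacing a user's entire pool changes the mean by at most 2, hence
    # Laplace noise of scale 2 / epsilon gives user-level epsilon-LDP.
    local_mean = sum(samples) / len(samples)
    clipped = max(-1.0, min(1.0, local_mean))
    return clipped + laplace_noise(2.0 / epsilon, rng)


def estimate_mean(user_pools, epsilon, rng):
    # Naive server-side aggregation: unweighted average of the noisy reports.
    reports = [privatize_user_mean(pool, epsilon, rng) for pool in user_pools]
    return sum(reports) / len(reports)


def simulate(n, epsilon, seed=0):
    rng = random.Random(seed)
    pools = []
    for _ in range(n):
        # Hypothetical M: m_u uniform on {1, ..., 20}, unknown to the server.
        m_u = rng.randint(1, 20)
        # Hypothetical mu: uniform on [-0.2, 0.8], whose true mean is 0.3.
        pools.append([rng.uniform(-0.2, 0.8) for _ in range(m_u)])
    return estimate_mean(pools, epsilon, rng)


if __name__ == "__main__":
    print(simulate(n=5000, epsilon=1.0))
```

In this sketch the injected noise has the same scale for every user, so a user holding a single sample influences the estimate as much as a user holding twenty.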

Cite

Text

Pla et al. "Distribution-Aware Mean Estimation Under User-Level Local Differential Privacy." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.

Markdown

[Pla et al. "Distribution-Aware Mean Estimation Under User-Level Local Differential Privacy." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.](https://mlanthology.org/aistats/2025/pla2025aistats-distributionaware/)

BibTeX

@inproceedings{pla2025aistats-distributionaware,
  title     = {{Distribution-Aware Mean Estimation Under User-Level Local Differential Privacy}},
  author    = {Pla, Corentin and Vono, Maxime and Richard, Hugo},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  year      = {2025},
  pages     = {2089-2097},
  volume    = {258},
  url       = {https://mlanthology.org/aistats/2025/pla2025aistats-distributionaware/}
}