Optimal Sparsity-Sensitive Bounds for Distributed Mean Estimation

Abstract

We consider the problem of estimating the mean of a set of vectors stored in a distributed system. This is a fundamental task with applications in distributed SGD and many other distributed problems, where communication is a major bottleneck for scaling up computations. We propose a new sparsity-aware algorithm that improves on previous results both theoretically and empirically. The communication cost of our algorithm is characterized by Hoyer's measure of sparseness. Moreover, we prove that the communication cost of our algorithm is information-theoretically optimal up to a constant factor in all sparseness regimes. Our experiments demonstrate the advantages of our method and confirm our theoretical findings.
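For reference, Hoyer's measure of sparseness (Hoyer, 2004) for a nonzero vector x in R^n is (sqrt(n) - ||x||_1 / ||x||_2) / (sqrt(n) - 1), ranging from 0 for a vector with all entries of equal magnitude to 1 for a one-hot vector. A minimal sketch of this measure (not the paper's algorithm itself):

```python
import math

def hoyer_sparseness(x):
    """Hoyer's sparseness: 0 for a uniform-magnitude vector, 1 for a one-hot vector."""
    n = len(x)
    l1 = sum(abs(v) for v in x)                 # ell_1 norm
    l2 = math.sqrt(sum(v * v for v in x))       # ell_2 norm (assumed nonzero)
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)

# A one-hot vector is maximally sparse; a constant vector is maximally dense.
print(hoyer_sparseness([1.0, 0.0, 0.0, 0.0]))  # -> 1.0
print(hoyer_sparseness([1.0, 1.0, 1.0, 1.0]))  # -> 0.0
```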

Cite

Text

Huang et al. "Optimal Sparsity-Sensitive Bounds for Distributed Mean Estimation." Neural Information Processing Systems, 2019.

Markdown

[Huang et al. "Optimal Sparsity-Sensitive Bounds for Distributed Mean Estimation." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/huang2019neurips-optimal/)

BibTeX

@inproceedings{huang2019neurips-optimal,
  title     = {{Optimal Sparsity-Sensitive Bounds for Distributed Mean Estimation}},
  author    = {Huang, Zengfeng and Huang, Ziyue and Wang, Yilei and Yi, Ke},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {6371--6381},
  url       = {https://mlanthology.org/neurips/2019/huang2019neurips-optimal/}
}