Communication-Efficient Algorithms for Statistical Optimization

Abstract

We study two communication-efficient algorithms for distributed statistical optimization on large-scale data. The first algorithm is an averaging method that distributes the $N$ data samples evenly to $m$ machines, performs separate minimization on each subset, and then averages the estimates. We provide a sharp analysis of this average mixture algorithm, showing that under a reasonable set of conditions, the combined parameter achieves mean-squared error that decays as $\mathcal{O}(N^{-1}+(N/m)^{-2})$. Whenever $m \le \sqrt{N}$, this guarantee matches the best possible rate achievable by a centralized algorithm having access to all $N$ samples. The second algorithm is a novel method, based on an appropriate form of the bootstrap. Requiring only a single round of communication, it has mean-squared error that decays as $\mathcal{O}(N^{-1}+(N/m)^{-3})$, and so is more robust to the amount of parallelization. We complement our theoretical results with experiments on large-scale problems from the Microsoft Learning to Rank dataset.
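The sketch below illustrates the two one-round schemes the abstract describes: plain averaging of locally computed estimates, and a subsampling-based correction in the spirit of the paper's bootstrap method. The local least-squares solver, the subsampling ratio `r`, and all function names are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def local_solve(X, y):
    """Solve one machine's local problem; least squares keeps the sketch
    self-contained, but any empirical-risk minimizer could be substituted."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

def avgm(X, y, m):
    """Average mixture: split the N samples into m equal parts, solve each
    local problem, and average the m estimates (one round of communication)."""
    splits = zip(np.array_split(X, m), np.array_split(y, m))
    return np.mean([local_solve(Xi, yi) for Xi, yi in splits], axis=0)

def savgm(X, y, m, r=0.1, rng=None):
    """Subsampled averaging (one reading of the paper's bootstrap correction):
    each machine also solves on a subsample of roughly r*n points, and the two
    averaged estimates are combined to cancel the leading-order bias."""
    rng = rng or np.random.default_rng(0)
    full, sub = [], []
    for Xi, yi in zip(np.array_split(X, m), np.array_split(y, m)):
        full.append(local_solve(Xi, yi))
        idx = rng.choice(len(yi), size=max(1, int(r * len(yi))), replace=False)
        sub.append(local_solve(Xi[idx], yi[idx]))
    theta1, theta2 = np.mean(full, axis=0), np.mean(sub, axis=0)
    return (theta1 - r * theta2) / (1.0 - r)
```

Both routines communicate only the $m$ local parameter vectors, which is what makes them attractive when the raw data cannot be centralized.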

Cite

Text

Zhang et al. "Communication-Efficient Algorithms for Statistical Optimization." Neural Information Processing Systems, 2012.

Markdown

[Zhang et al. "Communication-Efficient Algorithms for Statistical Optimization." Neural Information Processing Systems, 2012.](https://mlanthology.org/neurips/2012/zhang2012neurips-communicationefficient/)

BibTeX

@inproceedings{zhang2012neurips-communicationefficient,
  title     = {{Communication-Efficient Algorithms for Statistical Optimization}},
  author    = {Zhang, Yuchen and Wainwright, Martin J. and Duchi, John C.},
  booktitle = {Neural Information Processing Systems},
  year      = {2012},
  pages     = {1502-1510},
  url       = {https://mlanthology.org/neurips/2012/zhang2012neurips-communicationefficient/}
}