DUAL-LOCO: Distributing Statistical Estimation Using Random Projections

Abstract

We present DUAL-LOCO, a communication-efficient algorithm for distributed statistical estimation. DUAL-LOCO assumes that the data are distributed across workers by feature rather than by sample. It requires only a single round of communication, in which low-dimensional random projections are used to approximate the dependences between features available to different workers. We show that DUAL-LOCO has a bounded approximation error that depends only weakly on the number of workers. We compare DUAL-LOCO against a state-of-the-art distributed optimization method on a variety of real-world datasets and show that it obtains better speedups while retaining good accuracy.
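
The general idea in the abstract (feature-partitioned data, one round of communication, random projections standing in for the other workers' features) can be illustrated with a small sketch. This is not the authors' implementation. It assumes ridge regression solved in the primal, a dense Gaussian projection, and a single process that simulates the workers. The function name `feature_partitioned_ridge` and the parameters `n_workers`, `proj_dim`, and `lam` are hypothetical, chosen here for illustration only.

```python
import numpy as np


def feature_partitioned_ridge(X, y, n_workers=4, proj_dim=20, lam=1.0, seed=0):
    """Illustrative sketch only: ridge regression with features split across
    simulated workers, each seeing the other workers' features only through
    low-dimensional random projections (Gaussian projections are an assumption)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Split the feature indices (columns) into one block per worker.
    blocks = np.array_split(np.arange(p), n_workers)

    # Each worker compresses its own block to proj_dim columns.
    # These sketches are the only data exchanged (the single communication round).
    sketches = []
    for idx in blocks:
        projection = rng.normal(size=(len(idx), proj_dim)) / np.sqrt(proj_dim)
        sketches.append(X[:, idx] @ projection)
    total = sum(sketches)

    beta = np.zeros(p)
    for idx, own in zip(blocks, sketches):
        # Summed projections of every *other* worker's features.
        others = total - own
        # Local design: raw local features plus the compressed remainder.
        design = np.hstack([X[:, idx], others])
        gram = design.T @ design + lam * np.eye(design.shape[1])
        coef = np.linalg.solve(gram, design.T @ y)
        # Keep only the coefficients that belong to the raw local features.
        beta[idx] = coef[: len(idx)]
    return beta
```

With `n_workers=1` there is nothing to approximate, so the sketch reduces to ordinary ridge regression on the full feature matrix. With more workers, each local solve involves only `len(idx) + proj_dim` columns, and `proj_dim` controls how closely the compressed columns stand in for the features held elsewhere.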

Cite

Text

Heinze et al. "DUAL-LOCO: Distributing Statistical Estimation Using Random Projections." International Conference on Artificial Intelligence and Statistics, 2016.

Markdown

[Heinze et al. "DUAL-LOCO: Distributing Statistical Estimation Using Random Projections." International Conference on Artificial Intelligence and Statistics, 2016.](https://mlanthology.org/aistats/2016/heinze2016aistats-dual/)

BibTeX

@inproceedings{heinze2016aistats-dual,
  title     = {{DUAL-LOCO: Distributing Statistical Estimation Using Random Projections}},
  author    = {Heinze, Christina and McWilliams, Brian and Meinshausen, Nicolai},
  booktitle = {International Conference on Artificial Intelligence and Statistics},
  year      = {2016},
  pages     = {875--883},
  url       = {https://mlanthology.org/aistats/2016/heinze2016aistats-dual/}
}