Feature-Distributed Sparse Regression: A Screen-and-Clean Approach

Abstract

Most existing approaches to distributed sparse regression assume the data is partitioned by samples. However, for high-dimensional data (D >> N), it is more natural to partition the data by features. We propose an algorithm for distributed sparse regression when the data is partitioned by features rather than samples. Our approach allows the user to tailor our general method to various distributed computing platforms by trading off the total amount of data (in bits) sent over the communication network and the number of rounds of communication. We show that an implementation of our approach is capable of solving L1-regularized L2 regression problems with millions of features in minutes.
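For reference, the L1-regularized L2 regression (lasso) problem that the paper's distributed method targets is min_w (1/2)||y - Xw||^2 + lambda*||w||_1. Below is a minimal single-machine sketch of this problem solved by coordinate descent; it is purely illustrative of the objective, not the paper's screen-and-clean algorithm, and all names (`lasso_cd`, `soft_threshold`) are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of the L1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iters=200):
    """Coordinate descent for min_w 0.5*||y - Xw||^2 + lam*||w||_1."""
    n, d = X.shape
    w = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0)   # per-feature squared column norms
    r = y - X @ w                   # current residual
    for _ in range(n_iters):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            r += X[:, j] * w[j]     # remove feature j's contribution
            w[j] = soft_threshold(X[:, j] @ r, lam) / col_sq[j]
            r -= X[:, j] * w[j]     # restore residual with updated w[j]
    return w

# Toy sparse problem: only 3 of 50 features are truly active.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 50))
w_true = np.zeros(50)
w_true[:3] = [2.0, -3.0, 1.5]
y = X @ w_true + 0.01 * rng.standard_normal(100)
w_hat = lasso_cd(X, y, lam=1.0)
```

In the feature-distributed setting the paper studies, the columns of X would be spread across machines, so even this simple inner loop requires communication; the paper's contribution is controlling that communication cost.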

Cite

Text

Yang et al. "Feature-Distributed Sparse Regression: A Screen-and-Clean Approach." Neural Information Processing Systems, 2016.

Markdown

[Yang et al. "Feature-Distributed Sparse Regression: A Screen-and-Clean Approach." Neural Information Processing Systems, 2016.](https://mlanthology.org/neurips/2016/yang2016neurips-featuredistributed/)

BibTeX

@inproceedings{yang2016neurips-featuredistributed,
  title     = {{Feature-Distributed Sparse Regression: A Screen-and-Clean Approach}},
  author    = {Yang, Jiyan and Mahoney, Michael W. and Saunders, Michael and Sun, Yuekai},
  booktitle = {Neural Information Processing Systems},
  year      = {2016},
  pages     = {2712--2720},
  url       = {https://mlanthology.org/neurips/2016/yang2016neurips-featuredistributed/}
}