Communication Efficient Coresets for Empirical Loss Minimization
Abstract
In this paper, we study the problem of empirical loss minimization with ℓ2-regularization in distributed settings with significant communication cost. Stochastic gradient descent (SGD) and its variants are popular techniques for solving these problems in large-scale applications. However, the communication cost of these techniques is usually high, thus leading to considerable performance degradation. We introduce a novel approach to reduce the communication cost while retaining good convergence properties. The key to our approach is to construct a small summary of the data, called a coreset, at each iteration, and to solve an easy optimization problem based on the coreset. We present a general framework for analyzing coreset-based optimization and provide interesting insights into existing algorithms from this perspective. We then propose a new coreset construction and provide its convergence analysis for a wide class of problems that includes logistic regression and support vector machines. We demonstrate the performance of our algorithm on real-world datasets and compare it against state-of-the-art algorithms.
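The coreset idea summarized above can be sketched in a few lines: sample a small weighted subset whose weighted loss is an unbiased estimate of the full empirical loss, then run the optimizer on that subset only. The sketch below, for ℓ2-regularized logistic regression, uses a simple norm-based sampling distribution as a stand-in for the paper's actual construction; all function names and sampling scores here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def build_coreset(X, y, m, rng):
    """Sample m points with importance weights so the weighted coreset loss
    is an unbiased estimate of the full empirical loss. The sampling scores
    (uniform mixed with row norms) are a simple stand-in for a
    sensitivity-based construction."""
    n = X.shape[0]
    norms = np.linalg.norm(X, axis=1)
    p = 0.5 / n + 0.5 * norms / norms.sum()   # valid distribution: sums to 1
    idx = rng.choice(n, size=m, replace=True, p=p)
    weights = 1.0 / (n * m * p[idx])          # unbiased reweighting: E[w f] = mean loss
    return X[idx], y[idx], weights

def weighted_logistic_gd(X, y, w, lam=0.01, lr=0.5, iters=300):
    """Gradient descent on the coreset surrogate of the l2-regularized
    logistic loss: sum_i w_i log(1 + exp(-y_i <x_i, theta>)) + (lam/2)||theta||^2."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        margins = y * (X @ theta)
        sig = 1.0 / (1.0 + np.exp(margins))   # sigmoid of the negated margin
        grad = -(w * y * sig) @ X + lam * theta
        theta -= lr * grad
    return theta
```

The weights 1/(n·m·p_i) make the weighted subset loss an unbiased estimator of the average loss over all n points, so the surrogate problem can stand in for the full one while only the m sampled points need to be communicated.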
Cite
Text
Reddi et al. "Communication Efficient Coresets for Empirical Loss Minimization." Conference on Uncertainty in Artificial Intelligence, 2015.
Markdown
[Reddi et al. "Communication Efficient Coresets for Empirical Loss Minimization." Conference on Uncertainty in Artificial Intelligence, 2015.](https://mlanthology.org/uai/2015/reddi2015uai-communication/)
BibTeX
@inproceedings{reddi2015uai-communication,
title = {{Communication Efficient Coresets for Empirical Loss Minimization}},
author = {Reddi, Sashank J. and Póczos, Barnabás and Smola, Alexander J.},
booktitle = {Conference on Uncertainty in Artificial Intelligence},
year = {2015},
pages = {752--761},
url = {https://mlanthology.org/uai/2015/reddi2015uai-communication/}
}