On Variance Reduction in Stochastic Gradient Descent and Its Asynchronous Variants

Abstract

We study optimization algorithms based on variance reduction for stochastic gradient descent (SGD). Remarkable recent progress has been made in this direction through the development of algorithms like SAG, SVRG, and SAGA. These algorithms have been shown to outperform SGD, both theoretically and empirically. However, asynchronous versions of these algorithms—a crucial requirement for modern large-scale applications—have not been studied. We bridge this gap by presenting a unifying framework that captures many variance reduction techniques. Subsequently, we propose an asynchronous algorithm grounded in our framework, with fast convergence rates. An important consequence of our general approach is that it yields asynchronous versions of variance reduction algorithms such as SVRG and SAGA as a byproduct. Our method achieves near linear speedup in sparse settings common to machine learning. We demonstrate the empirical performance of our method through a concrete realization of asynchronous SVRG.
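For context, below is a minimal sketch of the variance-reduced gradient update (SVRG-style) that the abstract refers to, written for a generic finite-sum objective f(w) = (1/n) Σ_i f_i(w). The function names, defaults, and loop lengths are illustrative and not taken from the paper; the asynchronous variants studied in the paper run the inner loop concurrently across workers, which this serial sketch does not attempt to capture.

```python
import numpy as np

def svrg(grad_i, w0, n, step_size=0.1, num_epochs=20, inner_steps=None, rng=None):
    """Serial SVRG-style variance-reduced SGD for f(w) = (1/n) * sum_i f_i(w).

    grad_i(w, i) should return the gradient of the i-th component at w.
    All hyperparameter defaults here are placeholders, not the paper's settings.
    """
    rng = np.random.default_rng() if rng is None else rng
    m = n if inner_steps is None else inner_steps
    w_snapshot = np.asarray(w0, dtype=float).copy()

    for _ in range(num_epochs):
        # Full gradient at the snapshot point, computed once per epoch.
        full_grad = np.mean([grad_i(w_snapshot, i) for i in range(n)], axis=0)
        w = w_snapshot.copy()
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced stochastic gradient: subtracting the component
            # gradient at the snapshot and adding back the full gradient keeps
            # the estimate unbiased while shrinking its variance.
            g = grad_i(w, i) - grad_i(w_snapshot, i) + full_grad
            w -= step_size * g
        w_snapshot = w

    return w_snapshot
```

A typical use would plug in per-example gradients of a regularized loss (e.g. logistic regression) as `grad_i`; the reduced variance of the corrected gradient estimate is what lets these methods keep a constant step size where plain SGD must decay it.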

Cite

Text

Reddi et al. "On Variance Reduction in Stochastic Gradient Descent and Its Asynchronous Variants." Neural Information Processing Systems, 2015.

Markdown

[Reddi et al. "On Variance Reduction in Stochastic Gradient Descent and Its Asynchronous Variants." Neural Information Processing Systems, 2015.](https://mlanthology.org/neurips/2015/reddi2015neurips-variance/)

BibTeX

@inproceedings{reddi2015neurips-variance,
  title     = {{On Variance Reduction in Stochastic Gradient Descent and Its Asynchronous Variants}},
  author    = {Reddi, Sashank J. and Hefny, Ahmed and Sra, Suvrit and Poczos, Barnabas and Smola, Alexander J.},
  booktitle = {Neural Information Processing Systems},
  year      = {2015},
  pages     = {2647--2655},
  url       = {https://mlanthology.org/neurips/2015/reddi2015neurips-variance/}
}