Random Shuffling Beats SGD After Finite Epochs
Abstract
A long-standing problem in stochastic optimization is proving that \rsgd, the without-replacement version of \sgd, converges faster than the usual with-replacement \sgd. Building upon \citep{gurbuzbalaban2015random}, we present the first (to our knowledge) non-asymptotic results for this problem by proving that after a reasonable number of epochs \rsgd converges faster than \sgd. Specifically, we prove that for strongly convex, second-order smooth functions, the iterates of \rsgd converge to the optimal solution as $\mathcal{O}(\nicefrac{1}{T^2} + \nicefrac{n^3}{T^3})$, where $n$ is the number of components in the objective, and $T$ is number of iterations. This result implies that after $\mathcal{O}(\sqrt{n})$ epochs, \rsgd is strictly better than \sgd (which converges as $\mathcal{O}(\nicefrac{1}{T})$). The key step toward showing this better dependence on $T$ is the introduction of $n$ into the bound; and as our analysis shows, in general a dependence on $n$ is unavoidable without further changes. To understand how \rsgd works in practice, we further explore two empirically useful settings: data sparsity and over-parameterization. For sparse data, \rsgd has the rate $\mathcal{O}\left(\frac{1}{T^2}\right)$, again strictly better than \sgd. Under a setting closely related to over-parameterization, \rsgd is shown to converge faster than \sgd after any arbitrary number of iterations. Finally, we extend the analysis of \rsgd to smooth non-convex and convex functions.
Cite
Text
Haochen and Sra. "Random Shuffling Beats SGD After Finite Epochs." International Conference on Machine Learning, 2019.Markdown
[Haochen and Sra. "Random Shuffling Beats SGD After Finite Epochs." International Conference on Machine Learning, 2019.](https://mlanthology.org/icml/2019/haochen2019icml-random/)BibTeX
@inproceedings{haochen2019icml-random,
title = {{Random Shuffling Beats SGD After Finite Epochs}},
author = {Haochen, Jeff and Sra, Suvrit},
booktitle = {International Conference on Machine Learning},
year = {2019},
pages = {2624-2633},
volume = {97},
url = {https://mlanthology.org/icml/2019/haochen2019icml-random/}
}