Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization

Abstract

We develop a new accelerated stochastic gradient method for efficiently solving the convex regularized empirical risk minimization problem in mini-batch settings. The use of mini-batches has become a golden standard in the machine learning community, because the mini-batch techniques stabilize the gradient estimate and can easily make good use of parallel computing. The core of our proposed method is the incorporation of our new ``double acceleration'' technique and variance reduction technique. We theoretically analyze our proposed method and show that our method much improves the mini-batch efficiencies of previous accelerated stochastic methods, and essentially only needs size $\sqrt{n}$ mini-batches for achieving the optimal iteration complexities for both non-strongly and strongly convex objectives, where $n$ is the training set size. Further, we show that even in non-mini-batch settings, our method achieves the best known convergence rate for non-strongly convex and strongly convex objectives.

Cite

Text

Murata and Suzuki. "Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization." Neural Information Processing Systems, 2017.

Markdown

[Murata and Suzuki. "Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization." Neural Information Processing Systems, 2017.](https://mlanthology.org/neurips/2017/murata2017neurips-doubly/)

BibTeX

@inproceedings{murata2017neurips-doubly,
  title     = {{Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization}},
  author    = {Murata, Tomoya and Suzuki, Taiji},
  booktitle = {Neural Information Processing Systems},
  year      = {2017},
  pages     = {608-617},
  url       = {https://mlanthology.org/neurips/2017/murata2017neurips-doubly/}
}