Noise-Adaptive (Accelerated) Stochastic Heavy-Ball Momentum
Abstract
We analyze the convergence of stochastic heavy-ball (SHB) momentum in the smooth, strongly-convex setting. Kidambi et al. showed that SHB (with small mini-batches) cannot attain an accelerated rate of convergence even for quadratics, and conjectured that the practical gains of SHB are a by-product of mini-batching. We substantiate this claim by showing that SHB can obtain an accelerated rate when the mini-batch size is larger than a certain threshold. In particular, for strongly-convex quadratics with condition number $\kappa$, we prove that SHB with the standard step-size and momentum parameters results in an $O\left(\exp\left(-\frac{T}{\sqrt{\kappa}}\right) + \sigma\right)$ convergence rate, where $T$ is the number of iterations and $\sigma^2$ is the variance in the stochastic gradients. To ensure convergence to the minimizer, we propose a multi-stage approach that results in a noise-adaptive $O\left(\exp\left(-\frac{T}{\sqrt{\kappa}}\right) + \frac{\sigma}{T}\right)$ rate. For general strongly-convex functions, we use the averaging interpretation of SHB along with exponential step-sizes to prove an $O\left(\exp\left(-\frac{T}{\kappa}\right) + \frac{\sigma^2}{T}\right)$ rate of convergence to the minimizer. Finally, we empirically demonstrate the effectiveness of the proposed algorithms.
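Since the abstract centers on SHB with the standard (Polyak) step-size and momentum parameters for strongly-convex quadratics, the following is a minimal illustrative sketch on a synthetic least-squares problem. The problem instance, batch size, and iteration count are assumptions chosen for illustration; this is not the paper's multi-stage or exponential step-size algorithm.

import numpy as np

# Illustrative sketch: mini-batch stochastic heavy-ball (SHB) on a
# strongly-convex quadratic f(x) = (1/2n) * sum_i (a_i^T x - b_i)^2.
# The data, batch size, and horizon below are hypothetical choices.
rng = np.random.default_rng(0)
n, d = 1000, 20
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

# Smoothness (L) and strong-convexity (mu) constants of the full objective,
# computed here from the Hessian H = A^T A / n of the synthetic problem.
H = A.T @ A / n
eigs = np.linalg.eigvalsh(H)
mu, L = eigs[0], eigs[-1]
kappa = L / mu

# Standard (Polyak) heavy-ball parameters for quadratics.
alpha = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
beta = ((np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)) ** 2

batch_size = 128  # the accelerated rate in the abstract requires a large-enough mini-batch
x_prev = x = np.zeros(d)
for t in range(500):
    idx = rng.choice(n, size=batch_size, replace=False)
    # Mini-batch stochastic gradient of the least-squares objective.
    grad = A[idx].T @ (A[idx] @ x - b[idx]) / batch_size
    # SHB update: x_{t+1} = x_t - alpha * grad + beta * (x_t - x_{t-1}).
    x_next = x - alpha * grad + beta * (x - x_prev)
    x_prev, x = x, x_next

# Distance to the minimizer of the full quadratic, for a quick sanity check.
x_star = np.linalg.solve(H, A.T @ b / n)
print("distance to minimizer:", np.linalg.norm(x - x_star))

With small batches the iterates stall at a noise floor governed by $\sigma$, consistent with the $O\left(\exp\left(-\frac{T}{\sqrt{\kappa}}\right) + \sigma\right)$ rate; driving the error to zero is what the paper's multi-stage and exponential step-size variants address.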
Cite
Text
Dang et al. "Noise-Adaptive (Accelerated) Stochastic Heavy-Ball Momentum." NeurIPS 2023 Workshops: OPT, 2023.
Markdown
[Dang et al. "Noise-Adaptive (Accelerated) Stochastic Heavy-Ball Momentum." NeurIPS 2023 Workshops: OPT, 2023.](https://mlanthology.org/neuripsw/2023/dang2023neuripsw-noiseadaptive/)
BibTeX
@inproceedings{dang2023neuripsw-noiseadaptive,
  title     = {{Noise-Adaptive (Accelerated) Stochastic Heavy-Ball Momentum}},
  author    = {Dang, Anh Quang and Harikandeh, Reza Babanezhad and Vaswani, Sharan},
  booktitle = {NeurIPS 2023 Workshops: OPT},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/dang2023neuripsw-noiseadaptive/}
}