Stochastic Polyak Step-Sizes and Momentum: Convergence Guarantees and Practical Performance

Abstract

Stochastic gradient descent with momentum, also known as the Stochastic Heavy Ball (SHB) method, is one of the most popular algorithms for solving large-scale stochastic optimization problems in various machine learning tasks. In practice, tuning the step-size and momentum parameters of the method is prohibitively expensive and time-consuming. In this work, inspired by the recent success of the stochastic Polyak step-size in improving the performance of stochastic gradient descent (SGD), we propose and explore new Polyak-type variants suitable for the update rule of the SHB method. In particular, using the Iterate Moving Average (IMA) viewpoint of SHB, we propose and analyze three novel step-size selections: MomSPSmax, MomDecSPS, and MomAdaSPS. For MomSPSmax, we provide convergence guarantees for SHB to a neighborhood of the solution for convex and smooth problems (without assuming interpolation). If interpolation also holds, then with MomSPSmax, SHB converges to the true solution at a fast rate matching that of the deterministic heavy ball method. The other two variants, MomDecSPS and MomAdaSPS, are the first adaptive step-sizes for SHB that guarantee convergence to the exact minimizer, without a priori knowledge of the problem parameters and without assuming interpolation. Our convergence analysis of SHB is tight and recovers the convergence guarantees of the stochastic Polyak step-size for SGD as a special case. We supplement our analysis with experiments validating our theory and demonstrating the effectiveness and robustness of our algorithms.
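
To make the setting concrete, below is a minimal Python sketch of the SHB update combined with a capped stochastic Polyak step-size. It plugs the classical SPSmax step-size into the heavy ball recursion; the exact constants and scaling used by the paper's MomSPSmax (e.g., how the momentum parameter enters the cap) may differ, and the toy least-squares problem, the choice of c, gamma_max, and beta are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

# Toy interpolated least-squares problem: f_i(x) = 0.5 * (a_i @ x - b_i)^2,
# with b = A @ x_true so each f_i^* = 0 (interpolation holds).
rng = np.random.default_rng(0)
n, d = 50, 10
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true

def shb_polyak_sketch(A, b, beta=0.5, c=0.5, gamma_max=10.0, iters=500):
    """SHB with a capped stochastic Polyak step-size (illustrative sketch).

    Update: x_{t+1} = x_t - gamma_t * g_t + beta * (x_t - x_{t-1}),
    gamma_t = min{ (f_i(x_t) - f_i^*) / (c * ||g_t||^2), gamma_max }.
    This follows the classical SPSmax form inside the SHB update; the
    paper's MomSPSmax may scale these quantities differently.
    """
    x = np.zeros(A.shape[1])
    x_prev = x.copy()
    for _ in range(iters):
        i = rng.integers(len(b))             # sample a data point
        r = A[i] @ x - b[i]                  # residual on sample i
        f_i = 0.5 * r**2                     # stochastic loss; here f_i^* = 0
        g = r * A[i]                         # stochastic gradient
        gamma = min(f_i / (c * (g @ g) + 1e-12), gamma_max)
        x, x_prev = x - gamma * g + beta * (x - x_prev), x
    return x

x_hat = shb_polyak_sketch(A, b)
print("distance to solution:", np.linalg.norm(x_hat - x_true))
```

Under interpolation, as in this over-determined but exactly consistent least-squares example, each per-sample optimal value f_i^* is zero, which is how Polyak-type step-sizes are typically instantiated in practice.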

Cite

Text

Oikonomou and Loizou. "Stochastic Polyak Step-Sizes and Momentum: Convergence Guarantees and Practical Performance." International Conference on Learning Representations, 2025.

Markdown

[Oikonomou and Loizou. "Stochastic Polyak Step-Sizes and Momentum: Convergence Guarantees and Practical Performance." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/oikonomou2025iclr-stochastic/)

BibTeX

@inproceedings{oikonomou2025iclr-stochastic,
  title     = {{Stochastic Polyak Step-Sizes and Momentum: Convergence Guarantees and Practical Performance}},
  author    = {Oikonomou, Dimitris and Loizou, Nicolas},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/oikonomou2025iclr-stochastic/}
}