SGD with Memory: Fundamental Properties and Stochastic Acceleration

Abstract

An important open problem is the theoretically feasible acceleration of mini-batch SGD-type algorithms on quadratic problems with power-law spectrum. In the non-stochastic setting, the optimal exponent $\xi$ in the loss convergence $L_t\sim C_Lt^{-\xi}$ is double that in plain GD and is achievable using Heavy Ball (HB) with a suitable schedule; this no longer works in the presence of mini-batch noise. We address this challenge by considering first-order methods with an arbitrary fixed number $M$ of auxiliary velocity vectors (*memory-$M$ algorithms*). We first prove an equivalence between two forms of such algorithms and describe them in terms of suitable characteristic polynomials. Then we develop a general expansion of the loss in terms of *signal and noise propagators*. Using it, we show that losses of stationary stable memory-$M$ algorithms always retain the exponent $\xi$ of plain GD, but can have different constants $C_L$ depending on their *effective learning rate* that generalizes that of HB. We prove that in memory-1 algorithms we can make $C_L$ arbitrarily small while maintaining stability. As a consequence, we propose a memory-1 algorithm with a time-dependent schedule that we show heuristically and experimentally to improve the exponent $\xi$ of plain SGD.

Cite

Text

Yarotsky and Velikanov. "SGD with Memory: Fundamental Properties and Stochastic Acceleration." International Conference on Learning Representations, 2025.

Markdown

[Yarotsky and Velikanov. "SGD with Memory: Fundamental Properties and Stochastic Acceleration." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/yarotsky2025iclr-sgd/)

BibTeX

@inproceedings{yarotsky2025iclr-sgd,
  title     = {{SGD with Memory: Fundamental Properties and Stochastic Acceleration}},
  author    = {Yarotsky, Dmitry and Velikanov, Maksim},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/yarotsky2025iclr-sgd/}
}