A General-Purpose Theorem for High-Probability Bounds of Stochastic Approximation with Polyak Averaging

Abstract

Polyak–Ruppert averaging is a widely used technique for achieving the optimal asymptotic variance of stochastic approximation (SA) algorithms, yet its high-probability performance guarantees remain underexplored in general settings. In this paper, we present a general framework for establishing non-asymptotic concentration bounds on the error of averaged SA iterates. Our approach assumes access to individual concentration bounds for the unaveraged iterates and yields a sharp bound on the error of the averaged iterates. We also construct an example showing that our result is tight up to constant multiplicative factors. As direct applications, we derive tight concentration bounds for contractive SA algorithms and for algorithms such as temporal difference learning and $Q$-learning with averaging, obtaining new bounds in settings where traditional analysis is challenging.
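To make the averaging scheme the abstract refers to concrete, here is a minimal Python sketch of a scalar contractive SA recursion together with its running Polyak–Ruppert average. The operator `H`, the noise model, and the step-size schedule are illustrative assumptions for exposition only, not the setting or method of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sa_with_polyak_averaging(n_iters=10_000, x0=0.0):
    """Run a scalar SA recursion and track its Polyak-Ruppert average.

    Illustrative (hypothetical) problem: find the fixed point x* = 2 of the
    contraction H(x) = 0.5 * x + 1, observed only through additive noise.
    """
    x = x0       # unaveraged SA iterate
    x_bar = x0   # running Polyak-Ruppert average of the iterates
    for k in range(1, n_iters + 1):
        noisy_H = 0.5 * x + 1.0 + rng.normal(scale=1.0)  # noisy operator sample
        alpha = 1.0 / k**0.75            # step size decaying slower than 1/k
        x = x + alpha * (noisy_H - x)    # SA update toward the fixed point
        x_bar = x_bar + (x - x_bar) / k  # running mean of x_1, ..., x_k
    return x, x_bar

x_last, x_avg = sa_with_polyak_averaging()
print(f"last iterate: {x_last:.3f}, averaged iterate: {x_avg:.3f}, target: 2.000")
```

Running the sketch shows the averaged iterate concentrating much more tightly around the fixed point than the last iterate; the paper's framework quantifies this effect via high-probability bounds, which the toy example above does not attempt to reproduce.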

Cite

Text

Khodadadian and Zubeldia. "A General-Purpose Theorem for High-Probability Bounds of Stochastic Approximation with Polyak Averaging." Advances in Neural Information Processing Systems, 2025.

Markdown

[Khodadadian and Zubeldia. "A General-Purpose Theorem for High-Probability Bounds of Stochastic Approximation with Polyak Averaging." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/khodadadian2025neurips-generalpurpose/)

BibTeX

@inproceedings{khodadadian2025neurips-generalpurpose,
  title     = {{A General-Purpose Theorem for High-Probability Bounds of Stochastic Approximation with Polyak Averaging}},
  author    = {Khodadadian, Sajad and Zubeldia, Martin},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/khodadadian2025neurips-generalpurpose/}
}