Nesterov Acceleration Despite Very Noisy Gradients

Abstract

We present a generalization of Nesterov's accelerated gradient descent algorithm. Our algorithm (AGNES) provably achieves acceleration for smooth convex and strongly convex minimization tasks with noisy gradient estimates if the noise intensity is proportional to the magnitude of the gradient at every point. Nesterov's method converges at an accelerated rate if the constant of proportionality is below 1, while AGNES accommodates any signal-to-noise ratio. The noise model is motivated by applications in overparametrized machine learning. AGNES requires only two parameters in convex and three in strongly convex minimization tasks, improving on existing methods. We further provide clear geometric interpretations and heuristics for the choice of parameters.
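The noise model and parameter structure described above can be illustrated with a short numerical sketch. The snippet below is an illustration only, not the pseudocode from the paper: the gradient oracle adds noise whose standard deviation scales with the gradient magnitude (the constant sigma plays the role of the proportionality constant mentioned in the abstract), and the accelerated loop uses a gradient step size alpha, a momentum step size eta, and a momentum factor rho as stand-ins for the two or three parameters the abstract refers to. All names and default values are assumptions chosen for the example.

import numpy as np

rng = np.random.default_rng(0)

def noisy_grad(grad_fn, x, sigma):
    # Multiplicative noise model (assumed form): the perturbation scales with the
    # gradient norm, so the noise intensity is proportional to the gradient magnitude.
    g = grad_fn(x)
    return g + sigma * np.linalg.norm(g) * rng.standard_normal(g.shape)

def accelerated_sketch(grad_fn, x0, alpha, eta, rho, sigma, steps=2000):
    # Nesterov-style loop with decoupled step sizes: alpha for the descent step,
    # eta for the momentum variable, rho for the momentum factor. This is a sketch
    # consistent with the abstract's parameter counts, not the paper's algorithm verbatim.
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    y = x.copy()                      # look-ahead point where gradients are queried
    for _ in range(steps):
        g = noisy_grad(grad_fn, y, sigma)
        x = y - alpha * g             # gradient step with its own step size
        v = rho * v - eta * g         # momentum update with a separate step size
        y = x + v                     # next look-ahead point
    return x

# Example: strongly convex quadratic f(x) = 0.5 * ||x||^2, with gradient f'(x) = x.
x_final = accelerated_sketch(lambda x: x, x0=np.ones(5),
                             alpha=0.1, eta=0.05, rho=0.8, sigma=0.3)
print(np.linalg.norm(x_final))        # close to zero despite the noisy gradients

On a quadratic like this, the multiplicative noise shrinks as the iterates approach the minimizer, which hints at why convergence remains possible under this noise model even when the proportionality constant is large.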

Cite

Text

Gupta et al. "Nesterov Acceleration Despite Very Noisy Gradients." Neural Information Processing Systems, 2024. doi:10.52202/079017-0653

Markdown

[Gupta et al. "Nesterov Acceleration Despite Very Noisy Gradients." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/gupta2024neurips-nesterov/) doi:10.52202/079017-0653

BibTeX

@inproceedings{gupta2024neurips-nesterov,
  title     = {{Nesterov Acceleration Despite Very Noisy Gradients}},
  author    = {Gupta, Kanan and Siegel, Jonathan W. and Wojtowytsch, Stephan},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-0653},
  url       = {https://mlanthology.org/neurips/2024/gupta2024neurips-nesterov/}
}