A Dynamical Systems Perspective on Nesterov Acceleration

Abstract

We present a dynamical systems framework for understanding Nesterov’s accelerated gradient method. In contrast to earlier work, our derivation does not rely on a vanishing step size argument. We show that Nesterov acceleration arises from discretizing an ordinary differential equation with a semi-implicit Euler integration scheme. We analyze both the underlying differential equation and its discretization to obtain insights into the phenomenon of acceleration. The analysis suggests that a curvature-dependent damping term lies at the heart of the phenomenon. We further establish connections between the discretized and the continuous-time dynamics.
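
As a rough illustration of the discretization claim, the sketch below rewrites the standard constant-momentum form of Nesterov's method in position/velocity variables. Setting v_k = x_k - x_{k-1} turns the usual two-sequence recursion into a velocity update followed by a position update that uses the new velocity, which is the update ordering of a semi-implicit Euler step with unit time step. This is not the paper's own derivation or its specific differential equation. The test function (a diagonal quadratic), the parameter choices step = 1/L and beta = (sqrt(kappa) - 1)/(sqrt(kappa) + 1), and all function names are assumptions made for the example.

```python
import math

# Assumed test problem: f(x) = 0.5 * sum(a_i * x_i^2), with curvatures a_i.
DIAG = [1.0, 100.0]           # smallest curvature mu = 1, largest L = 100
MU, L = min(DIAG), max(DIAG)


def grad(x):
    """Gradient of the assumed diagonal quadratic."""
    return [a * xi for a, xi in zip(DIAG, x)]


def nesterov_two_sequence(x0, step, beta, iters):
    """Textbook form: extrapolate to y_k, then take a gradient step from y_k."""
    x_prev, x = list(x0), list(x0)
    for _ in range(iters):
        y = [xi + beta * (xi - xp) for xi, xp in zip(x, x_prev)]
        g = grad(y)
        x_prev, x = x, [yi - step * gi for yi, gi in zip(y, g)]
    return x


def nesterov_position_velocity(x0, step, beta, iters):
    """Same iteration in (x, v) variables, with v_k = x_k - x_{k-1}.

    The velocity is updated first (gradient evaluated at x + beta * v),
    then the position is advanced with the *new* velocity.
    """
    x, v = list(x0), [0.0] * len(x0)
    for _ in range(iters):
        g = grad([xi + beta * vi for xi, vi in zip(x, v)])
        v = [beta * vi - step * gi for vi, gi in zip(v, g)]
        x = [xi + vi for xi, vi in zip(x, v)]
    return x


# Common parameter choice for a mu-strongly convex, L-smooth objective.
KAPPA = L / MU
STEP = 1.0 / L
BETA = (math.sqrt(KAPPA) - 1.0) / (math.sqrt(KAPPA) + 1.0)

if __name__ == "__main__":
    x0 = [1.0, 1.0]
    print(nesterov_two_sequence(x0, STEP, BETA, 300))
    print(nesterov_position_velocity(x0, STEP, BETA, 300))
```

Both functions should produce the same iterates up to floating-point rounding, and both approach the minimizer at the origin. Note that the gradient is evaluated at x + beta * v rather than at x, so the effective damping depends on how the gradient changes along the velocity direction, which is one informal way to read the abstract's "curvature-dependent damping".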

Cite

Text

Muehlebach and Jordan. "A Dynamical Systems Perspective on Nesterov Acceleration." International Conference on Machine Learning, 2019.

Markdown

[Muehlebach and Jordan. "A Dynamical Systems Perspective on Nesterov Acceleration." International Conference on Machine Learning, 2019.](https://mlanthology.org/icml/2019/muehlebach2019icml-dynamical/)

BibTeX

@inproceedings{muehlebach2019icml-dynamical,
  title     = {{A Dynamical Systems Perspective on Nesterov Acceleration}},
  author    = {Muehlebach, Michael and Jordan, Michael},
  booktitle = {International Conference on Machine Learning},
  year      = {2019},
  pages     = {4656--4662},
  volume    = {97},
  url       = {https://mlanthology.org/icml/2019/muehlebach2019icml-dynamical/}
}