A Geometric Structure of Acceleration and Its Role in Making Gradients Small Fast

Abstract

Since Nesterov's seminal 1983 work, many accelerated first-order optimization methods have been proposed, but their analyses lack a common unifying structure. In this work, we identify a geometric structure satisfied by a wide range of first-order accelerated methods. Using this geometric insight, we present several novel generalizations of accelerated methods. Most interesting among them is a method that reduces the squared gradient norm at an $\mathcal{O}(1/K^4)$ rate in the prox-grad setup, faster than the $\mathcal{O}(1/K^3)$ rates of Nesterov's FGM or Kim and Fessler's FPGM-m.

Cite

Text

Lee et al. "A Geometric Structure of Acceleration and Its Role in Making Gradients Small Fast." Neural Information Processing Systems, 2021.

Markdown

[Lee et al. "A Geometric Structure of Acceleration and Its Role in Making Gradients Small Fast." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/lee2021neurips-geometric/)

BibTeX

@inproceedings{lee2021neurips-geometric,
  title     = {{A Geometric Structure of Acceleration and Its Role in Making Gradients Small Fast}},
  author    = {Lee, Jongmin and Park, Chanwoo and Ryu, Ernest},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/lee2021neurips-geometric/}
}