A Generic Approach for Escaping Saddle Points

Abstract

A central challenge to using first-order methods for optimizing nonconvex problems is the presence of saddle points. First-order methods often get stuck at saddle points, greatly deteriorating their performance. Typically, escaping saddle points requires second-order methods. However, most works on second-order methods rely extensively on expensive Hessian-based computations, making them impractical in large-scale settings. To tackle this challenge, we introduce a generic framework that minimizes Hessian-based computations while at the same time provably converging to second-order critical points. Our framework carefully alternates between a first-order and a second-order subroutine, using the latter only close to saddle points, and yields convergence results competitive with the state of the art. Empirical results suggest that our strategy also enjoys good practical performance.
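
To make the alternation described above concrete, here is a minimal sketch of such a first-order/second-order loop. It is a hypothetical illustration under simple assumptions, not the paper's exact algorithm or guarantees: it takes gradient steps while the gradient is large, and only near approximate critical points does it compute the Hessian, stepping along a negative-curvature direction when one exists. All names and tolerances (escape_saddle_framework, grad_tol, curv_tol, and the example function) are assumptions made for illustration.

import numpy as np

def escape_saddle_framework(f, grad, hess, x0, step_size=1e-2,
                            grad_tol=1e-4, curv_tol=1e-3, max_iter=10000):
    # Hypothetical sketch of a generic first-/second-order alternation,
    # not the authors' exact algorithm: gradient steps while the gradient
    # is large, a Hessian-based escape step only near critical points.
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) > grad_tol:
            x = x - step_size * g          # first-order subroutine
        else:
            # Second-order subroutine, used only close to saddle points.
            eigvals, eigvecs = np.linalg.eigh(hess(x))
            if eigvals[0] < -curv_tol:
                v = eigvecs[:, 0]          # negative-curvature direction
                if f(x + step_size * v) > f(x - step_size * v):
                    v = -v                 # pick the descending sign
                x = x + step_size * v
            else:
                # Approximate second-order critical point: small gradient
                # and no significantly negative curvature.
                return x
    return x

# Example: f(x, y) = x^2 - y^2 + y^4 has a saddle at the origin and minima
# near (0, +-0.707); starting near the saddle, the sketch escapes along the
# negative-curvature direction and then converges via gradient steps.
f = lambda x: x[0]**2 - x[1]**2 + x[1]**4
grad = lambda x: np.array([2 * x[0], -2 * x[1] + 4 * x[1]**3])
hess = lambda x: np.array([[2.0, 0.0], [0.0, -2.0 + 12.0 * x[1]**2]])
x_star = escape_saddle_framework(f, grad, hess, x0=[1e-6, 1e-6])

The key design point the sketch illustrates is that the Hessian is touched only when the gradient is already small, which is exactly how such a framework can keep Hessian-based computations to a minimum.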

Cite

Text

Reddi et al. "A Generic Approach for Escaping Saddle Points." International Conference on Artificial Intelligence and Statistics, 2018.

Markdown

[Reddi et al. "A Generic Approach for Escaping Saddle Points." International Conference on Artificial Intelligence and Statistics, 2018.](https://mlanthology.org/aistats/2018/reddi2018aistats-generic/)

BibTeX

@inproceedings{reddi2018aistats-generic,
  title     = {{A Generic Approach for Escaping Saddle Points}},
  author    = {Reddi, Sashank J. and Zaheer, Manzil and Sra, Suvrit and Póczos, Barnabás and Bach, Francis R. and Salakhutdinov, Ruslan and Smola, Alexander J.},
  booktitle = {International Conference on Artificial Intelligence and Statistics},
  year      = {2018},
  pages     = {1233--1242},
  url       = {https://mlanthology.org/aistats/2018/reddi2018aistats-generic/}
}