Escape Saddle Points by a Simple Gradient-Descent Based Algorithm

Abstract

Escaping saddle points is a central research topic in nonconvex optimization. In this paper, we propose a simple gradient-based algorithm that, for a smooth function $f\colon\mathbb{R}^n\to\mathbb{R}$, outputs an $\epsilon$-approximate second-order stationary point in $\tilde{O}(\log n/\epsilon^{1.75})$ iterations. Compared to the previous state-of-the-art algorithms by Jin et al., which take $\tilde{O}(\log^4 n/\epsilon^{2})$ or $\tilde{O}(\log^6 n/\epsilon^{1.75})$ iterations, our algorithm is polynomially better in terms of $\log n$ and matches their complexities in terms of $1/\epsilon$. In the stochastic setting, our algorithm outputs an $\epsilon$-approximate second-order stationary point in $\tilde{O}(\log^{2} n/\epsilon^{4})$ iterations. Technically, our main contribution is the idea of implementing a robust Hessian power method using only gradients, which can find negative curvature near saddle points and achieves a polynomial speedup in $\log n$ over perturbed gradient descent methods. Finally, we also perform numerical experiments that support our results.
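
The abstract's key mechanism, running a Hessian power method with gradient queries alone, can be illustrated with a short sketch. The snippet below is not the authors' exact algorithm: it approximates Hessian-vector products by finite differences of gradients and runs power iteration on the shifted matrix $\ell I - \nabla^2 f(x)$ to extract a negative-curvature direction near a saddle point. The function names, the finite-difference radius `r`, and the smoothness bound `ell` (an assumed upper bound on $\|\nabla^2 f(x)\|$) are illustrative, and the robustness ingredients of the paper's analysis are omitted.

```python
import numpy as np

def hessian_vector_product(grad, x, v, r=1e-4):
    # H(x) @ v ~ (grad(x + r*v) - grad(x)) / r, using only gradient queries.
    return (grad(x + r * v) - grad(x)) / r

def negative_curvature_direction(grad, x, ell=2.0, n_iters=50, r=1e-4, seed=0):
    """Estimate the most negative curvature direction of f at x.

    Power iteration on (ell * I - H): when ell >= ||H||, the dominant
    eigenvector of the shifted matrix is the eigenvector of H with the
    smallest (most negative) eigenvalue. `ell` is an assumed smoothness
    bound, not a quantity from the paper's pseudocode.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(x.shape)
    v /= np.linalg.norm(v)
    for _ in range(n_iters):
        hv = hessian_vector_product(grad, x, v, r)
        v = ell * v - hv          # one step of the shifted power method
        v /= np.linalg.norm(v)
    # Rayleigh quotient estimates the curvature of f along v.
    curvature = v @ hessian_vector_product(grad, x, v, r)
    return v, curvature

# Demo: f(x) = x0^2 - x1^2 has a saddle at the origin with
# negative curvature along e1.
grad = lambda x: np.array([2.0 * x[0], -2.0 * x[1]])
v, lam = negative_curvature_direction(grad, x=np.array([1e-3, 1e-3]))
print(v, lam)  # expect v close to +/- e1 and lam close to -2
```

In the demo, the Hessian is $\mathrm{diag}(2, -2)$, so the shifted matrix $\ell I - H$ with $\ell = 2$ has eigenvalues $0$ and $4$, and the iteration converges to the negative-curvature direction $\pm e_1$ with curvature estimate near $-2$.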

Cite

Text

Zhang and Li. "Escape Saddle Points by a Simple Gradient-Descent Based Algorithm." Neural Information Processing Systems, 2021.

Markdown

[Zhang and Li. "Escape Saddle Points by a Simple Gradient-Descent Based Algorithm." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/zhang2021neurips-escape/)

BibTeX

@inproceedings{zhang2021neurips-escape,
  title     = {{Escape Saddle Points by a Simple Gradient-Descent Based Algorithm}},
  author    = {Zhang, Chenyi and Li, Tongyang},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/zhang2021neurips-escape/}
}