Sharpened Quasi-Newton Methods: Faster Superlinear Rate and Larger Local Convergence Neighborhood

Abstract

Non-asymptotic analysis of quasi-Newton methods has received much attention recently. In particular, several works have established a non-asymptotic superlinear rate of $\mathcal{O}((1/\sqrt{t})^t)$ for the (classic) BFGS method by exploiting the fact that the error of its Newton direction approximation approaches zero. Moreover, a greedy variant of BFGS was recently proposed that accelerates convergence by directly approximating the Hessian matrix, rather than the Newton direction, and achieves a fast local quadratic convergence rate. Alas, the local quadratic convergence of Greedy-BFGS requires many more updates than the number of iterations BFGS needs to attain its local superlinear rate. This is because Greedy-BFGS approximates the Hessian directly, so its Newton direction approximation may not be as accurate as that of BFGS. In this paper, we close this gap and present a novel BFGS method that has the best of both worlds: it leverages the approximation ideas of both BFGS and Greedy-BFGS to properly approximate both the Newton direction and the Hessian matrix. Our theoretical results show that our method outperforms both BFGS and Greedy-BFGS in terms of convergence rate, while reaching its quadratic convergence rate with fewer steps than Greedy-BFGS. Numerical experiments on various datasets also confirm our theoretical findings.
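For context, the classic BFGS update that the abstract refers to maintains a Hessian approximation $B_k$ satisfying the secant condition $B_{k+1} s_k = y_k$, where $s_k$ is the step and $y_k$ the gradient difference. A minimal sketch (not the paper's sharpened method, just the standard update it builds on; the quadratic test function is an illustrative assumption):

```python
import numpy as np

def bfgs_update(B, s, y):
    """Classic BFGS update of the Hessian approximation B.

    s: step x_{k+1} - x_k; y: gradient difference g_{k+1} - g_k.
    The updated matrix satisfies the secant condition B_new @ s = y.
    """
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)

# Illustrative example on a quadratic f(x) = 0.5 x^T A x,
# where gradient differences are exactly y = A s.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
B = np.eye(2)
s = np.array([1.0, -0.5])
y = A @ s
B = bfgs_update(B, s, y)
assert np.allclose(B @ s, y)  # secant condition holds after the update
```

Greedy-BFGS replaces the step-based direction $s_k$ with a greedily chosen basis vector so that $B_k$ converges to the true Hessian; the paper's sharpened method combines both choices of update direction.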

Cite

Text

Jin et al. "Sharpened Quasi-Newton Methods: Faster Superlinear Rate and Larger Local Convergence Neighborhood." International Conference on Machine Learning, 2022.

Markdown

[Jin et al. "Sharpened Quasi-Newton Methods: Faster Superlinear Rate and Larger Local Convergence Neighborhood." International Conference on Machine Learning, 2022.](https://mlanthology.org/icml/2022/jin2022icml-sharpened/)

BibTeX

@inproceedings{jin2022icml-sharpened,
  title     = {{Sharpened Quasi-Newton Methods: Faster Superlinear Rate and Larger Local Convergence Neighborhood}},
  author    = {Jin, Qiujiang and Koppel, Alec and Rajawat, Ketan and Mokhtari, Aryan},
  booktitle = {International Conference on Machine Learning},
  year      = {2022},
  pages     = {10228--10250},
  volume    = {162},
  url       = {https://mlanthology.org/icml/2022/jin2022icml-sharpened/}
}