Convergence Guarantees for a Class of Non-Convex and Non-Smooth Optimization Problems

Abstract

We consider the problem of finding critical points of functions that are non-convex and non-smooth. Studying a fairly broad class of such problems, we analyze the behavior of three gradient-based methods (gradient descent, proximal update, and Frank-Wolfe update). For each of these methods, we establish rates of convergence for general problems, and also prove faster rates for continuous sub-analytic functions. We also show that our algorithms can escape strict saddle points for a class of non-smooth functions, thereby generalizing known results for smooth functions. Our analysis leads to a simplification of the popular CCCP algorithm, used for optimizing functions that can be written as a difference of two convex functions. Our simplified algorithm retains all the convergence properties of CCCP, along with a significantly lower cost per iteration. We illustrate our methods and theory via applications to the problems of best subset selection, robust estimation, mixture density estimation, and shape-from-shading reconstruction.
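To make the difference-of-convex setting mentioned above concrete, here is a minimal illustrative sketch, not the paper's simplified algorithm: it applies the classical CCCP linearize-then-minimize step to a toy objective f(x) = g(x) - h(x), where g and h are convex. The choices of g, h, step size, and iteration counts below are hypothetical and for demonstration only.

# Illustrative sketch (assumptions, not the algorithm analyzed in the paper):
# a CCCP-style update for f(x) = g(x) - h(x) with g, h convex. Each outer step
# freezes the gradient of h at the current iterate and approximately minimizes
# the convex surrogate g(y) - <grad_h(x_t), y> by plain gradient descent.
import numpy as np

def cccp_sketch(grad_g, grad_h, x0, step=0.05, outer=30, inner=200):
    x = np.array(x0, dtype=float)
    for _ in range(outer):
        lin = grad_h(x)                  # linearization of the subtracted convex part
        y = x.copy()
        for _ in range(inner):           # inner gradient descent on the convex surrogate
            y = y - step * (grad_g(y) - lin)
        x = y
    return x

# Toy usage with hypothetical g(x) = 0.5*||x||^2 + ||x - c||^2 and h(x) = ||x||_2
# (gradient of h smoothed near the origin):
c = np.array([1.0, -2.0])
grad_g = lambda x: x + 2.0 * (x - c)
grad_h = lambda x: x / (np.linalg.norm(x) + 1e-8)
x_hat = cccp_sketch(grad_g, grad_h, x0=np.array([0.5, 0.5]))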

Cite

Text

Khamaru and Wainwright. "Convergence Guarantees for a Class of Non-Convex and Non-Smooth Optimization Problems." Journal of Machine Learning Research, 2019.

Markdown

[Khamaru and Wainwright. "Convergence Guarantees for a Class of Non-Convex and Non-Smooth Optimization Problems." Journal of Machine Learning Research, 2019.](https://mlanthology.org/jmlr/2019/khamaru2019jmlr-convergence/)

BibTeX

@article{khamaru2019jmlr-convergence,
  title     = {{Convergence Guarantees for a Class of Non-Convex and Non-Smooth Optimization Problems}},
  author    = {Khamaru, Koulik and Wainwright, Martin J.},
  journal   = {Journal of Machine Learning Research},
  year      = {2019},
  pages     = {1--52},
  volume    = {20},
  url       = {https://mlanthology.org/jmlr/2019/khamaru2019jmlr-convergence/}
}