Convex and Non-Convex Optimization Under Generalized Smoothness
Abstract
Classical analysis of convex and non-convex optimization methods often requires the Lipschitz continuity of the gradient, which limits the analysis to functions bounded by quadratics. Recent work relaxed this requirement to a non-uniform smoothness condition with the Hessian norm bounded by an affine function of the gradient norm, and proved convergence in the non-convex setting via gradient clipping, assuming bounded noise. In this paper, we further generalize this non-uniform smoothness condition and develop a simple, yet powerful analysis technique that bounds the gradients along the trajectory, thereby leading to stronger results for both convex and non-convex optimization problems. In particular, we obtain the classical convergence rates for (stochastic) gradient descent and Nesterov's accelerated gradient method in the convex and/or non-convex setting under this general smoothness condition. The new analysis approach does not require gradient clipping and allows heavy-tailed noise with bounded variance in the stochastic setting.
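For reference, the affine gradient-norm bound on the Hessian mentioned above is usually written as the (L_0, L_1)-smoothness condition sketched below; the symbols L_0, L_1, and the non-decreasing function \ell follow the standard notation of the prior work and of this paper, and are assumptions here rather than quotations from the abstract.

```latex
% (L_0, L_1)-smoothness from the prior work referenced above:
% the Hessian norm is bounded by an affine function of the gradient norm.
\[
  \|\nabla^2 f(x)\| \;\le\; L_0 + L_1 \,\|\nabla f(x)\|.
\]
% The generalized condition studied in this paper (notation assumed here)
% replaces the affine bound with a non-decreasing function \ell of the
% gradient norm:
\[
  \|\nabla^2 f(x)\| \;\le\; \ell\big(\|\nabla f(x)\|\big).
\]
```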
Cite
Text
Li et al. "Convex and Non-Convex Optimization Under Generalized Smoothness." Neural Information Processing Systems, 2023.Markdown
[Li et al. "Convex and Non-Convex Optimization Under Generalized Smoothness." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/li2023neurips-convex/)BibTeX
@inproceedings{li2023neurips-convex,
title = {{Convex and Non-Convex Optimization Under Generalized Smoothness}},
author = {Li, Haochuan and Qian, Jian and Tian, Yi and Rakhlin, Alexander and Jadbabaie, Ali},
booktitle = {Neural Information Processing Systems},
year = {2023},
url = {https://mlanthology.org/neurips/2023/li2023neurips-convex/}
}