Non-Uniform Smoothness for Gradient Descent

Abstract

The analysis of gradient descent-type methods typically relies on the Lipschitz continuity of the objective gradient. This generally requires an expensive hyperparameter tuning process to appropriately calibrate a stepsize for a given problem. In this work we introduce a local first-order smoothness oracle (LFSO) which generalizes the Lipschitz continuous gradient smoothness condition and is applicable to any twice-differentiable function. We show that this oracle can encode all relevant problem information for tuning stepsizes for a suitably modified gradient descent method, and we give global and local convergence results. We also show that LFSOs in this modified first-order method can yield global linear convergence rates for non-strongly convex problems with extremely flat minima, and thus improve over the lower bound on rates achievable by general (accelerated) first-order methods.
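To make the idea concrete, the sketch below shows how a local smoothness estimate could drive a per-iterate stepsize in gradient descent. This is an illustrative reading of the abstract only: the function `lfso(x, r)` is assumed to return a bound on the gradient's Lipschitz constant over a ball of radius `r` around `x`, and the stepsize rule (a short fixed-point pass on `eta = 1 / L(x, eta * ||g||)`) is a plausible stand-in rather than the paper's exact update.

```python
import numpy as np

def gd_with_lfso(grad, lfso, x0, n_iters=100, tol=1e-8):
    """Gradient descent with a stepsize chosen from a local smoothness estimate.

    Illustrative sketch: `lfso(x, r)` is assumed to bound the gradient's
    Lipschitz constant on the ball of radius `r` around `x`. The stepsize is
    refined so the step x - eta*g stays inside the ball on which that bound
    is taken; this is not necessarily the paper's exact method.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        g = grad(x)
        gnorm = np.linalg.norm(g)
        if gnorm < tol:
            break
        # Start from a unit trust radius, then iterate eta = 1 / L(x, eta*||g||).
        eta = 1.0 / max(lfso(x, 1.0), 1e-12)
        for _ in range(5):
            eta = 1.0 / max(lfso(x, eta * gnorm), 1e-12)
        x = x - eta * g
    return x

# Example (hypothetical): f(x) = ||x||^4 has no global Lipschitz gradient,
# but its Hessian norm near x is bounded by 12*(||x|| + r)^2, giving an LFSO.
grad = lambda x: 4.0 * np.dot(x, x) * x
lfso = lambda x, r: 12.0 * (np.linalg.norm(x) + r) ** 2

x_final = gd_with_lfso(grad, lfso, x0=np.array([3.0, -2.0]))
```

On this quartic example the locally adapted stepsize grows as the iterates approach the flat minimum at the origin, which is the kind of behavior the abstract credits with improved rates for non-strongly convex problems with very flat minima.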

Cite

Text

Berahas et al. "Non-Uniform Smoothness for Gradient Descent." Transactions on Machine Learning Research, 2024.

Markdown

[Berahas et al. "Non-Uniform Smoothness for Gradient Descent." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/berahas2024tmlr-nonuniform/)

BibTeX

@article{berahas2024tmlr-nonuniform,
  title     = {{Non-Uniform Smoothness for Gradient Descent}},
  author    = {Berahas, Albert S. and Roberts, Lindon and Roosta, Fred},
  journal   = {Transactions on Machine Learning Research},
  year      = {2024},
  url       = {https://mlanthology.org/tmlr/2024/berahas2024tmlr-nonuniform/}
}