Non-Uniform Smoothness for Gradient Descent
Abstract
The analysis of gradient descent-type methods typically relies on the Lipschitz continuity of the objective's gradient. This generally requires an expensive hyperparameter tuning process to appropriately calibrate a stepsize for a given problem. In this work we introduce a local first-order smoothness oracle (LFSO) which generalizes the Lipschitz-continuous-gradient smoothness condition and is applicable to any twice-differentiable function. We show that this oracle can encode all relevant problem information for tuning stepsizes in a suitably modified gradient descent method, and we give global and local convergence results. We also show that LFSOs in this modified first-order method can yield global linear convergence rates for non-strongly convex problems with extremely flat minima, and thus improve over the lower bound on rates achievable by general (accelerated) first-order methods.
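To make the idea in the abstract concrete, the sketch below runs gradient descent with a stepsize drawn from a hand-coded local smoothness bound on the convex but non-strongly-convex objective f(x) = ||x||^4, which has a very flat minimum and no global Lipschitz constant for its gradient. This is a minimal illustration only: the objective, the `lfso` function, and the stepsize rule `min(1/L_local, delta/||g||)` are assumptions made for this example and are not taken from the paper's actual algorithm.

```python
import numpy as np

def f(x):
    # Illustrative objective: f(x) = ||x||^4, convex but not strongly convex,
    # with a flat minimum and no global Lipschitz gradient constant.
    return np.dot(x, x) ** 2

def grad_f(x):
    # Gradient of ||x||^4 is 4 ||x||^2 x.
    return 4.0 * np.dot(x, x) * x

def lfso(x, delta):
    # Hypothetical local first-order smoothness oracle: an upper bound on the
    # Hessian norm of f over the ball of radius delta around x.
    # For f(x) = ||x||^4, ||Hess f(y)|| <= 12 ||y||^2 <= 12 (||x|| + delta)^2.
    return 12.0 * (np.linalg.norm(x) + delta) ** 2

def lfso_gradient_descent(x0, delta=1.0, max_iters=500, tol=1e-8):
    # Gradient descent with a stepsize chosen from the local smoothness bound,
    # truncated so each step stays inside the ball where that bound is valid.
    x = x0.copy()
    for _ in range(max_iters):
        g = grad_f(x)
        gnorm = np.linalg.norm(g)
        if gnorm < tol:
            break
        L_local = lfso(x, delta)
        # Illustrative stepsize rule (an assumption, not the authors' exact
        # update): the usual 1/L step, capped so the step length <= delta.
        alpha = min(1.0 / L_local, delta / gnorm)
        x = x - alpha * g
    return x

if __name__ == "__main__":
    x_final = lfso_gradient_descent(np.array([2.0, -1.5]))
    print("approximate minimizer:", x_final, "f =", f(x_final))
```

The point of the example is only that the stepsize adapts to the local curvature bound at each iterate rather than to a single global Lipschitz constant, which is the kind of non-uniform smoothness information the LFSO is meant to supply.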
Cite
Text
Berahas et al. "Non-Uniform Smoothness for Gradient Descent." Transactions on Machine Learning Research, 2024.
Markdown
[Berahas et al. "Non-Uniform Smoothness for Gradient Descent." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/berahas2024tmlr-nonuniform/)
BibTeX
@article{berahas2024tmlr-nonuniform,
title = {{Non-Uniform Smoothness for Gradient Descent}},
author = {Berahas, Albert S. and Roberts, Lindon and Roosta, Fred},
journal = {Transactions on Machine Learning Research},
year = {2024},
url = {https://mlanthology.org/tmlr/2024/berahas2024tmlr-nonuniform/}
}