QLABGrad: A Hyperparameter-Free and Convergence-Guaranteed Scheme for Deep Learning

Abstract

The learning rate is a critical hyperparameter for deep learning tasks since it determines how much the model parameters are adjusted over the course of training. However, the choice of learning rate typically depends on empirical judgment, which may not yield satisfactory outcomes without intensive trial-and-error experiments. In this study, we propose a novel learning rate adaptation scheme called QLABGrad. Without any user-specified hyperparameter, QLABGrad automatically determines the learning rate by optimizing the quadratic loss approximation-based (QLAB) function for a given gradient descent direction, where only one extra forward propagation is required. We theoretically prove the convergence of QLABGrad under the smooth Lipschitz condition on the loss function. Experimental results with multiple architectures, including MLP, CNN, and ResNet, on the MNIST, CIFAR10, and ImageNet datasets demonstrate that QLABGrad outperforms widely adopted schemes for deep learning.
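To make the idea in the abstract concrete, below is a minimal, hypothetical sketch of a QLAB-style step in PyTorch: fit a one-dimensional quadratic to the loss along the negative-gradient direction using one extra forward pass at a small probe step, then move to the minimizer of that quadratic. The names `qlab_style_step`, `loss_fn`, `probe_lr`, and the fallback behavior are illustrative assumptions, not the authors' actual interface or algorithm details.

```python
import torch

def qlab_style_step(params, loss_fn, probe_lr=1e-3, eps=1e-12):
    """Sketch of one step with a quadratic-approximation learning rate.

    Assumes `params` is a list of leaf tensors with requires_grad=True and
    `loss_fn()` recomputes the scalar loss from the current parameter values.
    """
    # Current loss and gradient (one backward pass).
    loss0 = loss_fn()
    grads = torch.autograd.grad(loss0, params)
    g_sq = sum((g * g).sum() for g in grads)  # ||g||^2

    with torch.no_grad():
        # Probe: take a small trial step and re-evaluate the loss
        # (the single extra forward propagation).
        for p, g in zip(params, grads):
            p -= probe_lr * g
        loss_probe = loss_fn()

        # Quadratic model along -g:  L(eta) ~= L(0) - eta*||g||^2 + 0.5*a*eta^2.
        # Estimate the curvature a from the probe point, then minimize in eta.
        a = 2.0 * (loss_probe - loss0 + probe_lr * g_sq) / (probe_lr ** 2)
        if a > 0:
            lr = (g_sq / (a + eps)).item()
        else:
            lr = probe_lr  # fall back when the local curvature is non-positive

        # Move from the probe point to the quadratic minimizer.
        for p, g in zip(params, grads):
            p += (probe_lr - lr) * g

    return loss0.item(), lr
```

In this sketch the step size is derived entirely from the current loss, the gradient norm, and one probe evaluation, so no learning rate needs to be specified by the user; how QLABGrad actually constructs and safeguards the quadratic approximation is detailed in the paper.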

Cite

Text

Fu and Wu. "QLABGrad: A Hyperparameter-Free and Convergence-Guaranteed Scheme for Deep Learning." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I11.29095

Markdown

[Fu and Wu. "QLABGrad: A Hyperparameter-Free and Convergence-Guaranteed Scheme for Deep Learning." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/fu2024aaai-qlabgrad/) doi:10.1609/AAAI.V38I11.29095

BibTeX

@inproceedings{fu2024aaai-qlabgrad,
  title     = {{QLABGrad: A Hyperparameter-Free and Convergence-Guaranteed Scheme for Deep Learning}},
  author    = {Fu, Minghan and Wu, Fang-Xiang},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {12072--12081},
  doi       = {10.1609/AAAI.V38I11.29095},
  url       = {https://mlanthology.org/aaai/2024/fu2024aaai-qlabgrad/}
}