MSL: An Adaptive Momentum-Based Stochastic Line-Search Framework
Abstract
Various adaptive step sizes have been proposed recently to reduce the amount of tedious manual tuning. A popular example is backtracking line-search based on a stochastic Armijo condition. But the success of this strategy relies crucially on the search direction being a descent direction. Importantly, this condition is violated by both SGD with momentum (SGDM) and Adam, which are common choices in deep-net training. Adaptively choosing the step size in this setting is thus non-trivial and less explored despite its practical relevance. In this work, we propose two frameworks, namely, momentum correction and restart, that allow the use of stochastic line-search in conjunction with a generalized Armijo condition, and apply them to both SGDM and Adam. We empirically verify that the proposed algorithms are robust to the choice of the momentum parameter and other hyperparameters.
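For context (this is the standard condition from the line-search literature, not the paper's generalized variant, and the notation $f_{i_k}$, $d_k$, $\eta_k$ is ours), the stochastic Armijo condition accepts a step size $\eta_k$ along a search direction $d_k$ when the mini-batch loss $f_{i_k}$ decreases sufficiently:

$$
f_{i_k}(x_k + \eta_k d_k) \;\le\; f_{i_k}(x_k) + c \,\eta_k\, \nabla f_{i_k}(x_k)^\top d_k, \qquad c \in (0, 1).
$$

For SGD, $d_k = -\nabla f_{i_k}(x_k)$ gives $\nabla f_{i_k}(x_k)^\top d_k < 0$, so backtracking (repeatedly shrinking $\eta_k$) can always satisfy the condition. For SGDM or Adam, $d_k$ need not be a descent direction for $f_{i_k}$, so the right-hand side may not decrease and naive backtracking can fail; this is the gap the momentum-correction and restart frameworks are designed to close.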
Cite

Text

Fan et al. "MSL: An Adaptive Momentum-Based Stochastic Line-Search Framework." NeurIPS 2023 Workshops: OPT, 2023.

Markdown

[Fan et al. "MSL: An Adaptive Momentum-Based Stochastic Line-Search Framework." NeurIPS 2023 Workshops: OPT, 2023.](https://mlanthology.org/neuripsw/2023/fan2023neuripsw-msl/)

BibTeX
@inproceedings{fan2023neuripsw-msl,
title = {{MSL: An Adaptive Momentum-Based Stochastic Line-Search Framework}},
author = {Fan, Chen and Vaswani, Sharan and Thrampoulidis, Christos and Schmidt, Mark},
booktitle = {NeurIPS 2023 Workshops: OPT},
year = {2023},
url = {https://mlanthology.org/neuripsw/2023/fan2023neuripsw-msl/}
}