History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms
Abstract
Variance-reduced algorithms, although achieving strong theoretical performance, can run slowly in practice due to the periodic gradient estimation with a large batch of data. Batch-size adaptation thus arises as a promising approach to accelerate such algorithms. However, existing schemes either apply a prescribed batch-size adaptation rule or exploit the information along the optimization path via additional backtracking and condition-verification steps. In this paper, we propose a novel scheme which eliminates backtracking line search but still exploits the information along the optimization path by adapting the batch size via history stochastic gradients. We further show theoretically that such a scheme substantially reduces the overall complexity for the popular variance-reduced algorithms SVRG and SARAH/SPIDER, for both conventional nonconvex optimization and reinforcement learning problems. To this end, we develop a new convergence analysis framework to handle the dependence of the batch size on history stochastic gradients. Extensive experiments validate the effectiveness of the proposed batch-size adaptation scheme.
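To make the idea concrete, below is a minimal Python sketch of an SVRG-style loop in which the batch size used for the periodic gradient estimation is adapted from the squared norms of the stochastic gradients observed in the previous epoch, rather than a prescribed schedule or a backtracking line search. The specific adaptation rule, constants, and helper names (`grad_fn`, `c`, `b_min`) are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def svrg_adaptive_batch(grad_fn, x0, n, n_epochs=20, inner_steps=100,
                        lr=0.01, b_min=64, c=1.0):
    """SVRG-style loop with a history-gradient aided batch size (illustrative sketch).

    grad_fn(x, idx) is assumed to return the average gradient of the sampled
    component functions `idx` at the point `x`. The batch size used for the
    periodic gradient estimation is adapted from the squared norms of the
    stochastic gradients collected in the previous epoch.
    """
    x = x0.copy()
    batch_size = n  # first epoch: estimate with the full dataset
    for epoch in range(n_epochs):
        # Periodic gradient estimation with the (adapted) large batch.
        ref_idx = np.random.choice(n, size=batch_size, replace=False)
        x_ref, g_ref = x.copy(), grad_fn(x, ref_idx)

        sq_norms = []  # history of squared stochastic-gradient norms
        for _ in range(inner_steps):
            i = np.random.randint(n)
            # Variance-reduced gradient estimator (standard SVRG form).
            v = grad_fn(x, [i]) - grad_fn(x_ref, [i]) + g_ref
            sq_norms.append(float(np.dot(v, v)))
            x -= lr * v

        # History-gradient aided adaptation: small history gradients suggest
        # proximity to a stationary point, calling for a larger batch to keep
        # the estimation error comparable; large history gradients allow a
        # smaller batch. (Hypothetical rule, not the paper's exact one.)
        avg_sq = max(np.mean(sq_norms), 1e-12)
        batch_size = int(np.clip(c / avg_sq, b_min, n))
    return x
```

The key design point illustrated here is that the batch size is driven by information already gathered along the optimization path (the history gradients), so no extra backtracking or condition-verification passes over the data are needed.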
Cite
Text
Ji et al. "History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms." International Conference on Machine Learning, 2020.
Markdown
[Ji et al. "History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms." International Conference on Machine Learning, 2020.](https://mlanthology.org/icml/2020/ji2020icml-historygradient/)
BibTeX
@inproceedings{ji2020icml-historygradient,
title = {{History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms}},
author = {Ji, Kaiyi and Wang, Zhe and Weng, Bowen and Zhou, Yi and Zhang, Wei and Liang, Yingbin},
booktitle = {International Conference on Machine Learning},
year = {2020},
pages = {4762--4772},
volume = {119},
url = {https://mlanthology.org/icml/2020/ji2020icml-historygradient/}
}