History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms

Abstract

Variance-reduced algorithms, although they enjoy strong theoretical guarantees, can run slowly in practice because they periodically estimate the gradient over a large batch of data. Batch-size adaptation thus arises as a promising approach to accelerate such algorithms. However, existing schemes either apply a prescribed batch-size adaptation rule or exploit the information along the optimization path via additional backtracking and condition-verification steps. In this paper, we propose a novel scheme that eliminates backtracking line search but still exploits the information along the optimization path by adapting the batch size via history stochastic gradients. We further show theoretically that such a scheme substantially reduces the overall complexity of the popular variance-reduced algorithms SVRG and SARAH/SPIDER for both conventional nonconvex optimization and reinforcement learning problems. To this end, we develop a new convergence analysis framework to handle the dependence of the batch size on history stochastic gradients. Extensive experiments validate the effectiveness of the proposed batch-size adaptation scheme.
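To make the idea concrete, below is a minimal Python sketch of a SARAH/SPIDER-style loop whose outer-loop batch size is adapted from the norms of recent history stochastic gradients. The specific rule shown (batch size inversely proportional to the mean squared norm of the history estimators, clipped to a range, with hypothetical parameters such as c, b_min, and b_max) is an illustrative assumption for exposition, not necessarily the exact rule analyzed in the paper.

    import numpy as np

    def adaptive_spider(grad_fn, x0, n_samples, n_epochs=10, inner_len=50,
                        lr=0.01, b_max=1024, b_min=32, c=1.0, rng=None):
        """SARAH/SPIDER-style loop with a history-gradient-aided batch size.

        grad_fn(x, idx) should return the average gradient of the sampled
        losses indexed by `idx` at point `x`.  The batch-size rule below
        (inversely proportional to the mean squared norm of recent
        stochastic gradient estimators) is an illustrative assumption.
        """
        rng = np.random.default_rng() if rng is None else rng
        x = np.asarray(x0, dtype=float)
        history = []  # squared norms of recent gradient estimators

        for epoch in range(n_epochs):
            # Adapt the large batch size from history gradients: small
            # history gradients suggest proximity to a stationary point,
            # so a larger (more accurate) batch is used, and vice versa.
            if history:
                b = int(np.clip(c / (np.mean(history) + 1e-12), b_min, b_max))
            else:
                b = b_max
            idx = rng.choice(n_samples, size=min(b, n_samples), replace=False)
            v = grad_fn(x, idx)  # large-batch gradient estimator
            history = []

            for _ in range(inner_len):
                x_new = x - lr * v
                j = rng.integers(n_samples, size=1)
                # SARAH/SPIDER recursive gradient estimator update.
                v = v + grad_fn(x_new, j) - grad_fn(x, j)
                x = x_new
                history.append(float(np.dot(v, v)))
        return x

For instance, grad_fn could average per-sample gradients of a finite-sum loss, and the constant c would control how aggressively the batch grows as the history gradient norms shrink; no backtracking line search or extra condition verification is needed in this loop.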

Cite

Text

Ji et al. "History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms." International Conference on Machine Learning, 2020.

Markdown

[Ji et al. "History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms." International Conference on Machine Learning, 2020.](https://mlanthology.org/icml/2020/ji2020icml-historygradient/)

BibTeX

@inproceedings{ji2020icml-historygradient,
  title     = {{History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms}},
  author    = {Ji, Kaiyi and Wang, Zhe and Weng, Bowen and Zhou, Yi and Zhang, Wei and Liang, Yingbin},
  booktitle = {International Conference on Machine Learning},
  year      = {2020},
  pages     = {4762--4772},
  volume    = {119},
  url       = {https://mlanthology.org/icml/2020/ji2020icml-historygradient/}
}