Stochastic Adaptive Quasi-Newton Methods for Minimizing Expected Values
Abstract
We propose a novel class of stochastic, adaptive methods for minimizing self-concordant functions which can be expressed as an expected value. These methods generate an estimate of the true objective function by taking the empirical mean over a sample drawn at each step, making the problem tractable. The use of adaptive step sizes eliminates the need for the user to supply a step size. Methods in this class include extensions of gradient descent (GD) and BFGS. We show that, given a suitable amount of sampling, the stochastic adaptive GD attains linear convergence in expectation, and with further sampling, the stochastic adaptive BFGS attains R-superlinear convergence. We present experiments showing that these methods compare favorably to SGD.
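The abstract describes the method class only at a high level. Purely as an illustration, the Python sketch below shows the general pattern it outlines: draw a fresh sample at each step, use the empirical-mean gradient over that sample, update a BFGS inverse-Hessian approximation, and choose the step size automatically rather than taking it from the user. The damped-Newton-style rule t = 1/(1 + λ) used here is the classical automatic step size for self-concordant functions and is only a stand-in for the paper's actual adaptive rule; `grad_batch`, `sampler`, and the toy least-squares example are hypothetical placeholders, not from the paper.

```python
import numpy as np

def stochastic_adaptive_bfgs(grad_batch, sampler, x0, n_iters=100, batch_size=256):
    """Illustrative sketch (not the paper's exact algorithm).

    grad_batch(x, batch): empirical-mean gradient of the sampled losses at x.
    sampler(batch_size):  draws a fresh mini-batch of data indices each step.
    """
    x = np.asarray(x0, dtype=float)
    n = x.size
    H = np.eye(n)                                   # inverse-Hessian approximation
    for _ in range(n_iters):
        batch = sampler(batch_size)                 # fresh sample at each step
        g = grad_batch(x, batch)                    # empirical-mean gradient
        d = -H @ g                                  # quasi-Newton direction
        lam = np.sqrt(max(g @ (H @ g), 0.0))        # Newton-decrement proxy
        t = 1.0 / (1.0 + lam)                       # adaptive step size, no tuning
        s = t * d
        x_new = x + s
        y = grad_batch(x_new, batch) - g            # same batch for a consistent secant pair
        sy = s @ y
        if sy > 1e-12:                              # standard BFGS curvature safeguard
            rho = 1.0 / sy
            V = np.eye(n) - rho * np.outer(s, y)
            H = V @ H @ V.T + rho * np.outer(s, s)  # BFGS inverse-Hessian update
        x = x_new
    return x

# Toy usage on a random least-squares problem (illustrative only).
rng = np.random.default_rng(0)
A, b = rng.normal(size=(10_000, 20)), rng.normal(size=10_000)
grad = lambda x, idx: A[idx].T @ (A[idx] @ x - b[idx]) / len(idx)
draw = lambda m: rng.integers(0, len(b), size=m)
x_hat = stochastic_adaptive_bfgs(grad, draw, np.zeros(20))
```

Computing both gradients in the secant pair on the same mini-batch keeps the curvature estimate consistent; this is a common design choice in stochastic quasi-Newton sketches rather than a detail taken from the paper.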
Cite
Text
Zhou et al. "Stochastic Adaptive Quasi-Newton Methods for Minimizing Expected Values." International Conference on Machine Learning, 2017.
Markdown
[Zhou et al. "Stochastic Adaptive Quasi-Newton Methods for Minimizing Expected Values." International Conference on Machine Learning, 2017.](https://mlanthology.org/icml/2017/zhou2017icml-stochastic/)
BibTeX
@inproceedings{zhou2017icml-stochastic,
title = {{Stochastic Adaptive Quasi-Newton Methods for Minimizing Expected Values}},
author = {Zhou, Chaoxu and Gao, Wenbo and Goldfarb, Donald},
booktitle = {International Conference on Machine Learning},
year = {2017},
pages = {4150--4159},
volume = {70},
url = {https://mlanthology.org/icml/2017/zhou2017icml-stochastic/}
}