A Progressive Batching L-BFGS Method for Machine Learning

Abstract

The standard L-BFGS method relies on gradient approximations that are not dominated by noise, so that search directions are descent directions, the line search is reliable, and quasi-Newton updating yields useful quadratic models of the objective function. All of this appears to call for a full batch approach, but since small batch sizes give rise to faster algorithms with better generalization properties, L-BFGS is currently not considered an algorithm of choice for large-scale machine learning applications. One need not, however, choose between the two extremes represented by the full batch or highly stochastic regimes, and may instead follow a progressive batching approach in which the sample size increases during the course of the optimization. In this paper, we present a new version of the L-BFGS algorithm that combines three basic components - progressive batching, a stochastic line search, and stable quasi-Newton updating - and that performs well on training logistic regression and deep neural networks. We provide supporting convergence theory for the method.
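Below is a minimal sketch, not the authors' implementation, of the progressive-batching idea the abstract describes: stochastic gradients are computed on a sample whose size grows during the run, steps are taken with the standard L-BFGS two-loop recursion, and a curvature pair is stored only when it satisfies a positivity check. The synthetic logistic-regression data, the fixed step size (the paper uses a stochastic line search), and all parameter names such as `grow_factor` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic logistic-regression problem (assumed, for illustration only).
n, d = 5000, 20
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

def batch_grad(w, idx):
    """Average logistic-loss gradient over the sampled indices."""
    p = 1.0 / (1.0 + np.exp(-X[idx] @ w))
    return X[idx].T @ (p - y[idx]) / len(idx)

def two_loop(g, s_list, y_list):
    """Standard L-BFGS two-loop recursion; returns the search direction -H*g."""
    q, alphas = g.copy(), []
    for s, yv in zip(reversed(s_list), reversed(y_list)):
        a = (s @ q) / (yv @ s)
        alphas.append(a)
        q -= a * yv
    if s_list:  # initial Hessian scaling from the most recent pair
        q *= (s_list[-1] @ y_list[-1]) / (y_list[-1] @ y_list[-1])
    for (s, yv), a in zip(zip(s_list, y_list), reversed(alphas)):
        q += (a - (yv @ q) / (yv @ s)) * s
    return -q

w = np.zeros(d)
s_list, y_list, m = [], [], 10          # limited-memory curvature pairs
sample_size, grow_factor = 64, 1.1      # progressive batching schedule (assumed)
step = 1.0                              # the actual method chooses this by a stochastic line search

for k in range(200):
    idx = rng.choice(n, size=min(int(sample_size), n), replace=False)
    g = batch_grad(w, idx)
    p = two_loop(g, s_list, y_list)
    w_new = w + step * p
    # Curvature pair from gradients evaluated on the *same* sample (stable updating).
    s_vec, y_vec = w_new - w, batch_grad(w_new, idx) - g
    if y_vec @ s_vec > 1e-10:           # keep the pair only if curvature is positive
        s_list.append(s_vec); y_list.append(y_vec)
        if len(s_list) > m:
            s_list.pop(0); y_list.pop(0)
    w = w_new
    sample_size *= grow_factor          # progressively increase the batch size
```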

Cite

Text

Bollapragada et al. "A Progressive Batching L-BFGS Method for Machine Learning." International Conference on Machine Learning, 2018.

Markdown

[Bollapragada et al. "A Progressive Batching L-BFGS Method for Machine Learning." International Conference on Machine Learning, 2018.](https://mlanthology.org/icml/2018/bollapragada2018icml-progressive/)

BibTeX

@inproceedings{bollapragada2018icml-progressive,
  title     = {{A Progressive Batching L-BFGS Method for Machine Learning}},
  author    = {Bollapragada, Raghu and Nocedal, Jorge and Mudigere, Dheevatsa and Shi, Hao-Jun and Tang, Ping Tak Peter},
  booktitle = {International Conference on Machine Learning},
  year      = {2018},
  pages     = {620--629},
  volume    = {80},
  url       = {https://mlanthology.org/icml/2018/bollapragada2018icml-progressive/}
}