Using Curvature Information for Fast Stochastic Search
Abstract
We present an algorithm for fast stochastic gradient descent that uses a nonlinear adaptive momentum scheme to optimize the late-time convergence rate. The algorithm makes effective use of curvature information, requires only O(n) storage and computation, and delivers convergence rates close to the theoretical optimum. We demonstrate the technique on linear and large nonlinear backprop networks.
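For orientation, the baseline that the abstract builds on, stochastic gradient descent with a momentum term, can be sketched as follows. This is a minimal illustration only: the fixed momentum coefficient `beta`, the constant learning rate `mu`, the noise-free two-weight linear model, and the helper names `make_data` and `sgd_momentum` are all assumptions made for the example. It is not the nonlinear adaptive momentum rule of the paper, which the abstract does not spell out.

```python
import random


def make_data(num_samples, seed=0):
    """Hypothetical toy data: noise-free targets y = 2*x0 - 3*x1."""
    rng = random.Random(seed)
    data = []
    for _ in range(num_samples):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        data.append((x, 2.0 * x[0] - 3.0 * x[1]))
    return data


def sgd_momentum(samples, n, mu=0.05, beta=0.5, epochs=50, seed=1):
    """SGD with a fixed momentum term on a linear least-squares model.

    Assumption: constant mu and constant beta, chosen for simplicity.
    The paper's scheme adapts the momentum; this sketch does not.
    """
    rng = random.Random(seed)
    w = [0.0] * n  # weights: O(n) storage
    v = [0.0] * n  # momentum buffer: O(n) storage
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in random order each epoch
        for i in order:
            x, y = samples[i]
            # Prediction error of the linear model on this one sample.
            err = sum(wj * xj for wj, xj in zip(w, x)) - y
            for j in range(n):
                # Momentum update: decay the old step, add the new gradient step.
                v[j] = beta * v[j] - mu * err * x[j]
                w[j] += v[j]
    return w


data = make_data(200)
w = sgd_momentum(data, n=2)
print(w)
```

Both `w` and `v` are vectors of length n, so storage and per-step computation are O(n), matching the cost the abstract quotes. No Hessian matrix is formed here, so this sketch uses no curvature information at all.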
Cite
Text
Orr and Leen. "Using Curvature Information for Fast Stochastic Search." Neural Information Processing Systems, 1996.
Markdown
[Orr and Leen. "Using Curvature Information for Fast Stochastic Search." Neural Information Processing Systems, 1996.](https://mlanthology.org/neurips/1996/orr1996neurips-using/)
BibTeX
@inproceedings{orr1996neurips-using,
title = {{Using Curvature Information for Fast Stochastic Search}},
author = {Orr, Genevieve B. and Leen, Todd K.},
booktitle = {Neural Information Processing Systems},
year = {1996},
pages = {606-612},
url = {https://mlanthology.org/neurips/1996/orr1996neurips-using/}
}