Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors
Abstract
We propose a very simple and well-principled way of computing the optimal step size in gradient descent algorithms. The on-line version is computationally very efficient and is applicable to large backpropagation networks trained on large data sets. The main ingredient is a technique for estimating the principal eigenvalue(s) and eigenvector(s) of the objective function's second-derivative matrix (Hessian) that does not require computing the Hessian itself. Several other applications of this technique are proposed for speeding up learning or for eliminating useless parameters.
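The core idea can be illustrated with a short sketch: the product of the Hessian with an arbitrary vector can be approximated by a finite difference of two gradient evaluations, so power iteration on the Hessian needs only a gradient oracle. Below is a minimal, hypothetical NumPy rendering of that idea (not the paper's exact on-line procedure, which uses a running average during training); the function names, the tolerance constants, and the quadratic test problem are illustrative assumptions.

```python
import numpy as np

def hessian_vector_product(grad, w, v, eps=1e-4):
    """Approximate H @ v by a finite difference of the gradient:
    H v ~ (grad(w + eps*v) - grad(w)) / eps. The Hessian is never formed."""
    return (grad(w + eps * v) - grad(w)) / eps

def largest_eigenpair(grad, w, n_iter=100, eps=1e-4, seed=0):
    """Power iteration on the Hessian using only Hessian-vector products.
    Returns an estimate of the principal eigenvalue and eigenvector."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(w.shape)
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(n_iter):
        hv = hessian_vector_product(grad, w, v, eps)
        lam = np.linalg.norm(hv)      # ||H v|| -> |lambda_max| as v converges
        v = hv / (lam + 1e-12)        # renormalize the eigenvector estimate
    return lam, v

# Illustrative quadratic test problem: loss(w) = 0.5 * w^T A w, so H = A.
A = np.diag([10.0, 1.0, 0.1])
grad = lambda w: A @ w

w = np.ones(3)
lam_max, _ = largest_eigenpair(grad, w)
eta = 1.0 / lam_max   # step size scaled by the principal curvature
print(f"estimated lambda_max = {lam_max:.3f}, learning rate = {eta:.3f}")
```

On this toy problem the estimate converges to the dominant curvature (10.0), giving a step size of roughly 0.1; gradient descent with a step much larger than 2/lambda_max would diverge along the principal eigendirection, which is why the largest eigenvalue is the relevant quantity for setting the learning rate.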
Cite
Text
LeCun et al. "Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors." Neural Information Processing Systems, 1992.
Markdown
[LeCun et al. "Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors." Neural Information Processing Systems, 1992.](https://mlanthology.org/neurips/1992/lecun1992neurips-automatic/)
BibTeX
@inproceedings{lecun1992neurips-automatic,
title = {{Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors}},
author = {LeCun, Yann and Simard, Patrice Y. and Pearlmutter, Barak},
booktitle = {Neural Information Processing Systems},
year = {1992},
pages = {156--163},
url = {https://mlanthology.org/neurips/1992/lecun1992neurips-automatic/}
}