Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors

Abstract

We propose a very simple and well-principled way of computing the optimal step size in gradient descent algorithms. The on-line version is computationally very efficient and is applicable to large backpropagation networks trained on large data sets. The main ingredient is a technique for estimating the principal eigenvalue(s) and eigenvector(s) of the objective function's second derivative matrix (Hessian), which does not even require computing the Hessian. Several other applications of this technique are proposed for speeding up learning or for eliminating useless parameters.
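
The core idea is the power method combined with a finite-difference Hessian-vector product, which needs only gradient evaluations: H v ≈ (∇E(w + αv) − ∇E(w))/α, with v renormalized after each iteration. Below is a minimal sketch of the batch form of this idea in Python; the paper's on-line version instead maintains a running average over per-example gradient differences. All names here (`loss_grad`, `w`, `alpha`, `n_iters`) are illustrative, not from the paper.

```python
import numpy as np

def hessian_power_iteration(loss_grad, w, alpha=1e-4, n_iters=50, seed=0):
    """Estimate the principal eigenvalue/eigenvector of the Hessian of a
    loss at parameters w, using only its gradient function loss_grad."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(w.shape)
    v /= np.linalg.norm(v)
    g0 = loss_grad(w)
    eigval = 0.0
    for _ in range(n_iters):
        # Finite-difference approximation of the Hessian-vector product H v.
        hv = (loss_grad(w + alpha * v) - g0) / alpha
        eigval = np.linalg.norm(hv)   # converges to |largest eigenvalue|
        v = hv / (eigval + 1e-12)     # renormalize for the next iteration
    return eigval, v

# Example: a quadratic loss E(w) = 0.5 w^T A w, whose Hessian is A.
A = np.diag([10.0, 3.0, 1.0])
grad = lambda w: A @ w
lam, v = hessian_power_iteration(grad, w=np.ones(3))
print(lam)        # ~10, the largest eigenvalue of A
print(1.0 / lam)  # a step size of order 1/lambda_max keeps descent stable
```

For the quadratic example the finite difference is exact, so the iteration recovers the dominant eigenpair of A; the last line illustrates the paper's point that the largest eigenvalue directly bounds the usable learning rate.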

Cite

Text

LeCun et al. "Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors." Neural Information Processing Systems, 1992.

Markdown

[LeCun et al. "Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors." Neural Information Processing Systems, 1992.](https://mlanthology.org/neurips/1992/lecun1992neurips-automatic/)

BibTeX

@inproceedings{lecun1992neurips-automatic,
  title     = {{Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors}},
  author    = {LeCun, Yann and Simard, Patrice Y. and Pearlmutter, Barak},
  booktitle = {Neural Information Processing Systems},
  year      = {1992},
  pages     = {156--163},
  url       = {https://mlanthology.org/neurips/1992/lecun1992neurips-automatic/}
}