Iterative Scaled Trust-Region Learning in Krylov Subspaces via Pearlmutter's Implicit Sparse Hessian
Abstract
The online incremental gradient (or backpropagation) algorithm is widely considered to be the fastest method for solving large-scale neural-network (NN) learning problems. In contrast, we show that an appropriately implemented iterative batch-mode (or block-mode) learning method can be much faster. For example, it is three times faster in the UCI letter classification problem (26 outputs, 16,000 data items, 6,066 parameters with a two-hidden-layer multilayer perceptron) and 353 times faster in a nonlinear regression problem arising in color recipe prediction (10 outputs, 1,000 data items, 2,210 parameters with a neuro-fuzzy modular network). The three principal innovative ingredients in our algorithm are the following: First, we use scaled trust-region regularization with inner-outer it- eration to solve the associated “overdetermined” nonlinear least squares problem, where the inner iteration performs a truncated (or inexact) Newton method. Second, we employ Pearlmutter’s implicit sparse Hessian matrix-vector multiply algorithm to con- struct the Krylov subspaces used to solve for the truncated New- ton update. Third, we exploit sparsity (for preconditioning) in the matrices resulting from the NNs having many outputs.
Cite
Text
Mizutani and Demmel. "Iterative Scaled Trust-Region Learning in Krylov Subspaces via Pearlmutter's Implicit Sparse Hessian." Neural Information Processing Systems, 2003.Markdown
[Mizutani and Demmel. "Iterative Scaled Trust-Region Learning in Krylov Subspaces via Pearlmutter's Implicit Sparse Hessian." Neural Information Processing Systems, 2003.](https://mlanthology.org/neurips/2003/mizutani2003neurips-iterative/)BibTeX
@inproceedings{mizutani2003neurips-iterative,
title = {{Iterative Scaled Trust-Region Learning in Krylov Subspaces via Pearlmutter's Implicit Sparse Hessian}},
author = {Mizutani, Eiji and Demmel, James},
booktitle = {Neural Information Processing Systems},
year = {2003},
pages = {209-216},
url = {https://mlanthology.org/neurips/2003/mizutani2003neurips-iterative/}
}