Iterative Scaled Trust-Region Learning in Krylov Subspaces via Pearlmutter's Implicit Sparse Hessian

Abstract

The online incremental gradient (or backpropagation) algorithm is widely considered to be the fastest method for solving large-scale neural-network (NN) learning problems. In contrast, we show that an appropriately implemented iterative batch-mode (or block-mode) learning method can be much faster. For example, it is three times faster in the UCI letter classification problem (26 outputs, 16,000 data items, 6,066 parameters with a two-hidden-layer multilayer perceptron) and 353 times faster in a nonlinear regression problem arising in color recipe prediction (10 outputs, 1,000 data items, 2,210 parameters with a neuro-fuzzy modular network). The three principal innovative ingredients in our algorithm are the following: First, we use scaled trust-region regularization with inner-outer iteration to solve the associated “overdetermined” nonlinear least squares problem, where the inner iteration performs a truncated (or inexact) Newton method. Second, we employ Pearlmutter’s implicit sparse Hessian matrix-vector multiply algorithm to construct the Krylov subspaces used to solve for the truncated Newton update. Third, we exploit sparsity (for preconditioning) in the matrices resulting from the NNs having many outputs.
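
The second ingredient can be sketched concretely. The fragment below is a minimal illustration (not the authors' implementation), assuming a generic scalar loss function loss(params, batch) with a flat parameter vector params: it forms the Hessian-vector product H·v implicitly via forward-over-reverse differentiation (Pearlmutter's trick) and uses a few conjugate-gradient iterations, i.e. a truncated Newton solve over the Krylov subspace span{g, Hg, H²g, ...}, to approximate the Newton step. The trust-region scaling and the sparsity-based preconditioning described in the abstract are omitted.

import jax
import jax.numpy as jnp

def hvp(loss, params, batch, v):
    # Pearlmutter's trick: forward-over-reverse differentiation yields
    # H @ v without ever forming the Hessian explicitly.
    grad_fn = lambda p: jax.grad(loss)(p, batch)
    return jax.jvp(grad_fn, (params,), (v,))[1]

def truncated_newton_step(loss, params, batch, max_cg_iters=10, tol=1e-4):
    # Inexact (truncated) Newton: solve H d = -g approximately with CG,
    # which searches for d in the Krylov subspace span{g, Hg, H^2 g, ...}.
    g = jax.grad(loss)(params, batch)
    d = jnp.zeros_like(g)
    r = -g                       # residual of H d = -g at d = 0
    p = r
    rs_old = jnp.vdot(r, r)
    for _ in range(max_cg_iters):
        Hp = hvp(loss, params, batch, p)
        alpha = rs_old / jnp.vdot(p, Hp)
        d = d + alpha * p
        r = r - alpha * Hp
        rs_new = jnp.vdot(r, r)
        if jnp.sqrt(rs_new) < tol:
            break                # stop early once the residual is small
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return d

Each CG iteration costs one Hessian-vector product, i.e. roughly one extra forward-backward pass, which is what makes the inner truncated-Newton solve affordable in batch mode.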

Cite

Text

Mizutani and Demmel. "Iterative Scaled Trust-Region Learning in Krylov Subspaces via Pearlmutter's Implicit Sparse Hessian." Neural Information Processing Systems, 2003.

Markdown

[Mizutani and Demmel. "Iterative Scaled Trust-Region Learning in Krylov Subspaces via Pearlmutter's Implicit Sparse Hessian." Neural Information Processing Systems, 2003.](https://mlanthology.org/neurips/2003/mizutani2003neurips-iterative/)

BibTeX

@inproceedings{mizutani2003neurips-iterative,
  title     = {{Iterative Scaled Trust-Region Learning in Krylov Subspaces via Pearlmutter's Implicit Sparse Hessian}},
  author    = {Mizutani, Eiji and Demmel, James},
  booktitle = {Neural Information Processing Systems},
  year      = {2003},
  pages     = {209-216},
  url       = {https://mlanthology.org/neurips/2003/mizutani2003neurips-iterative/}
}