On Iterative Krylov-Dogleg Trust-Region Steps for Solving Neural Networks Nonlinear Least Squares Problems

Abstract

This paper describes a method of dogleg trust-region steps, or restricted Levenberg-Marquardt steps, based on a projection process onto the Krylov subspaces for neural-network nonlinear least squares problems. In particular, the linear conjugate gradient (CG) method works as the inner iterative algorithm for solving the linearized Gauss-Newton normal equation, whereas the outer nonlinear algorithm repeatedly takes so-called "Krylov-dogleg" steps, relying only on matrix-vector multiplication without explicitly forming the Jacobian matrix or the Gauss-Newton model Hessian. That is, our iterative dogleg algorithm can reduce both operation counts and memory space by a factor of O(n) (the number of parameters) in comparison with a direct linear-equation solver. This memory-less property is useful for large-scale problems.
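
The computational idea in the abstract, running CG on the Gauss-Newton normal equation (J^T J) p = -J^T r using only Jacobian-vector and transposed-Jacobian-vector products, so J is never materialized, can be sketched compactly with automatic differentiation. Below is a minimal, hedged sketch in JAX, not the authors' code: the tiny residual model and the names `residual`, `gn_matvec`, and `krylov_dogleg_step` are illustrative assumptions, and the trust-region handling shown is a Steihaug-Toint-style truncated-CG step, which is closely related to, but not identical with, the paper's Krylov-dogleg step.

```python
# Minimal sketch (assumptions noted above): matrix-free CG on the
# Gauss-Newton normal equation (J^T J) p = -J^T r, with products
# against J and J^T obtained from jax.jvp / jax.vjp so the Jacobian
# is never formed explicitly.
import jax
import jax.numpy as jnp

def residual(w, X, y):
    """Residuals of a tiny one-hidden-layer network (illustrative only)."""
    W1 = w[:8].reshape(4, 2); b1 = w[8:12]
    W2 = w[12:16];            b2 = w[16]
    h = jnp.tanh(X @ W1.T + b1)
    return h @ W2 + b2 - y

def gn_matvec(w, X, y, v):
    """Compute (J^T J) v via two matrix-free products."""
    f = lambda w_: residual(w_, X, y)
    _, Jv = jax.jvp(f, (w,), (v,))      # forward mode: J v
    _, vjp = jax.vjp(f, w)
    return vjp(Jv)[0]                   # reverse mode: J^T (J v)

def krylov_dogleg_step(w, X, y, radius, n_cg=10):
    """CG on the normal equation, truncated at the trust-region boundary."""
    f = lambda w_: residual(w_, X, y)
    _, vjp = jax.vjp(f, w)
    g = vjp(f(w))[0]                    # gradient J^T r
    p = jnp.zeros_like(w)
    r = -g                              # CG residual for (J^T J) p = -g
    d = r
    for _ in range(n_cg):
        Hd = gn_matvec(w, X, y, d)
        alpha = (r @ r) / (d @ Hd)
        p_next = p + alpha * d
        if jnp.linalg.norm(p_next) >= radius:
            # Stop on the boundary: solve ||p + tau*d|| = radius for tau >= 0.
            a = d @ d; b = 2 * (p @ d); c = p @ p - radius**2
            tau = (-b + jnp.sqrt(b**2 - 4*a*c)) / (2*a)
            return p + tau * d
        r_next = r - alpha * Hd
        beta = (r_next @ r_next) / (r @ r)
        p, r, d = p_next, r_next, r_next + beta * d
    return p
```

A typical outer iteration would call `krylov_dogleg_step(w, X, y, radius)` with, say, `w = jnp.zeros(17)`, then accept or reject the step and adjust `radius` by the usual trust-region ratio test. Because each inner iteration costs only one `jvp` and one `vjp` (each O(n) in memory), the O(n) savings over a direct normal-equation solver claimed in the abstract follow directly.
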

Cite

Text

Mizutani and Demmel. "On Iterative Krylov-Dogleg Trust-Region Steps for Solving Neural Networks Nonlinear Least Squares Problems." Neural Information Processing Systems, 2000.

Markdown

[Mizutani and Demmel. "On Iterative Krylov-Dogleg Trust-Region Steps for Solving Neural Networks Nonlinear Least Squares Problems." Neural Information Processing Systems, 2000.](https://mlanthology.org/neurips/2000/mizutani2000neurips-iterative/)

BibTeX

@inproceedings{mizutani2000neurips-iterative,
  title     = {{On Iterative Krylov-Dogleg Trust-Region Steps for Solving Neural Networks Nonlinear Least Squares Problems}},
  author    = {Mizutani, Eiji and Demmel, James},
  booktitle = {Neural Information Processing Systems},
  year      = {2000},
  pages     = {605--611},
  url       = {https://mlanthology.org/neurips/2000/mizutani2000neurips-iterative/}
}