Complexity Issues in Natural Gradient Descent Method for Training Multi-Layer Perceptrons

Abstract

The natural gradient descent method is applied to train an n-m-1 multilayer perceptron. Based on an efficient scheme to represent the Fisher information matrix for an n-m-1 stochastic multilayer perceptron, a new algorithm is proposed to calculate the natural gradient without inverting the Fisher information matrix explicitly. When the input dimension n is much larger than the number of hidden neurons m, the time complexity of computing the natural gradient is O(n).
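To make the setting concrete, here is a minimal sketch of the generic natural-gradient update θ ← θ − η F⁻¹∇L that the paper accelerates; this toy uses a one-parameter Gaussian model where the Fisher information is a known scalar, and it is NOT the paper's O(n) algorithm for multilayer perceptrons.

```python
import numpy as np

# Illustrative sketch only: the generic natural-gradient update
#   theta <- theta - eta * F^{-1} * grad,
# shown on a toy model (fitting the mean mu of Gaussian data), where the
# Fisher information for mu under N(mu, sigma^2) is the scalar 1/sigma^2.
# This is not the paper's algorithm, just the update rule it speeds up.

rng = np.random.default_rng(0)

sigma2 = 4.0
data = rng.normal(loc=3.0, scale=np.sqrt(sigma2), size=1000)

mu = 0.0
eta = 0.5
for _ in range(50):
    grad = (mu - data.mean()) / sigma2  # gradient of average neg. log-likelihood
    fisher = 1.0 / sigma2               # Fisher information for mu
    mu -= eta * grad / fisher           # natural-gradient step

# mu converges to the sample mean of the data
```

In the perceptron setting F is an m(n+2)-by-m(n+2) matrix rather than a scalar, and the paper's contribution is a representation of F that lets the product F⁻¹∇L be computed in O(n) time when n ≫ m, instead of the O(n³) cost of an explicit inversion.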

Cite

Text

Yang and Amari. "Complexity Issues in Natural Gradient Descent Method for Training Multi-Layer Perceptrons." Neural Computation, vol. 10, pp. 2137-2157, 1998. doi:10.1162/089976698300017007

Markdown

[Yang and Amari. "Complexity Issues in Natural Gradient Descent Method for Training Multi-Layer Perceptrons." Neural Computation, vol. 10, pp. 2137-2157, 1998.](https://mlanthology.org/neco/1998/yang1998neco-complexity/) doi:10.1162/089976698300017007

BibTeX

@article{yang1998neco-complexity,
  title     = {{Complexity Issues in Natural Gradient Descent Method for Training Multi-Layer Perceptrons}},
  author    = {Yang, Howard Hua and Amari, Shun-ichi},
  journal   = {Neural Computation},
  year      = {1998},
  pages     = {2137--2157},
  doi       = {10.1162/089976698300017007},
  volume    = {10},
  url       = {https://mlanthology.org/neco/1998/yang1998neco-complexity/}
}