Complexity Issues in Natural Gradient Descent Method for Training Multi-Layer Perceptrons
Abstract
The natural gradient descent method is applied to train an n-m-1 multilayer perceptron. Based on an efficient scheme to represent the Fisher information matrix for an n-m-1 stochastic multilayer perceptron, a new algorithm is proposed to calculate the natural gradient without inverting the Fisher information matrix explicitly. When the input dimension n is much larger than the number of hidden neurons m, the time complexity of computing the natural gradient is O(n).
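To make the complexity claim concrete, here is a hedged sketch (not the paper's algorithm) of the *naive* natural-gradient update, which explicitly solves with the Fisher information matrix. For a network with p parameters this baseline costs O(p³) per step; the paper's contribution is avoiding the explicit inversion for an n-m-1 perceptron, reducing the cost to O(n) when n ≫ m. The function name and toy Fisher matrix below are illustrative assumptions.

```python
import numpy as np

def naive_natural_gradient_step(theta, grad, fisher, lr=0.1):
    """One natural-gradient update: theta <- theta - lr * F^{-1} grad.

    Solving the linear system F x = grad is preferable to forming F^{-1}
    explicitly, but both are O(p^3) in general -- this is the baseline
    cost that the paper's scheme improves on for n-m-1 perceptrons.
    """
    nat_grad = np.linalg.solve(fisher, grad)
    return theta - lr * nat_grad

# Toy usage with a synthetic symmetric positive-definite "Fisher" matrix.
rng = np.random.default_rng(0)
p = 5
A = rng.standard_normal((p, p))
fisher = A @ A.T + p * np.eye(p)   # SPD by construction
theta = np.zeros(p)
grad = rng.standard_normal(p)
theta_new = naive_natural_gradient_step(theta, grad, fisher)
```

The SPD construction stands in for a true Fisher matrix only to keep the sketch self-contained; in the paper's setting F is derived from the stochastic perceptron model itself.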
Cite
Text
Yang and Amari. "Complexity Issues in Natural Gradient Descent Method for Training Multi-Layer Perceptrons." Neural Computation, 1998. doi:10.1162/089976698300017007
Markdown
[Yang and Amari. "Complexity Issues in Natural Gradient Descent Method for Training Multi-Layer Perceptrons." Neural Computation, 1998.](https://mlanthology.org/neco/1998/yang1998neco-complexity/) doi:10.1162/089976698300017007
BibTeX
@article{yang1998neco-complexity,
title = {{Complexity Issues in Natural Gradient Descent Method for Training Multi-Layer Perceptrons}},
author = {Yang, Howard Hua and Amari, Shun-ichi},
journal = {Neural Computation},
year = {1998},
pages = {2137-2157},
doi = {10.1162/089976698300017007},
volume = {10},
url = {https://mlanthology.org/neco/1998/yang1998neco-complexity/}
}