Recurrent Networks: Second Order Properties and Pruning
Abstract
Second order properties of cost functions for recurrent networks are investigated. We analyze a layered fully recurrent architecture; the virtue of this architecture is that it features the conventional feedforward architecture as a special case. A detailed description of recursive computation of the full Hessian of the network cost function is provided. We discuss the possibility of invoking simplifying approximations of the Hessian and show how weight decays iron the cost function and thereby greatly assist training. We present tentative pruning results, using Hassibi et al.'s Optimal Brain Surgeon, demonstrating that recurrent networks can construct an efficient internal memory.
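
The abstract does not reproduce the paper's equations, but the pruning criterion it refers to is Hassibi et al.'s Optimal Brain Surgeon, which ranks weights by a saliency computed from the inverse Hessian. Below is a minimal sketch of one OBS pruning step, assuming a flattened weight vector w and the full Hessian H of the cost are already available; the function name obs_prune_step and the optional decay damping term are illustrative additions, not taken from the paper.

# Minimal sketch of one Optimal Brain Surgeon (OBS) pruning step, following
# Hassibi et al.; names and the weight-decay damping are illustrative and
# not taken from Pedersen & Hansen's paper.
import numpy as np

def obs_prune_step(w, H, decay=0.0):
    """Remove the single weight with the smallest OBS saliency.

    w     : flattened weight vector, shape (n,)
    H     : full Hessian of the cost w.r.t. w, shape (n, n)
    decay : optional damping; adding decay * I to H is one way a quadratic
            weight-decay penalty appears in the Hessian
    Returns updated weights, the pruned index, and its saliency.
    """
    H_reg = H + decay * np.eye(len(w))      # (possibly regularized) Hessian
    H_inv = np.linalg.inv(H_reg)            # OBS works with the inverse Hessian
    diag = np.diag(H_inv)
    saliency = w ** 2 / (2.0 * diag)        # L_q = w_q^2 / (2 [H^-1]_qq)
    q = int(np.argmin(saliency))            # least salient weight
    # Optimal adjustment of the remaining weights when weight q is set to zero:
    # delta_w = -(w_q / [H^-1]_qq) * H^-1 e_q
    delta_w = -(w[q] / diag[q]) * H_inv[:, q]
    w_new = w + delta_w
    w_new[q] = 0.0                          # enforce exact removal
    return w_new, q, saliency[q]

# Tiny usage example on a random positive-definite Hessian.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.normal(size=(5, 5))
    H = A @ A.T + 1e-2 * np.eye(5)
    w = rng.normal(size=5)
    w_pruned, q, L = obs_prune_step(w, H, decay=1e-3)
    print(f"pruned weight {q} with saliency {L:.4f}")

In practice the original OBS procedure maintains the inverse Hessian recursively rather than inverting the full matrix at every step; the explicit inverse above is only for readability of the sketch.
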
Cite
Text
Pedersen and Hansen. "Recurrent Networks: Second Order Properties and Pruning." Neural Information Processing Systems, 1994.
Markdown
[Pedersen and Hansen. "Recurrent Networks: Second Order Properties and Pruning." Neural Information Processing Systems, 1994.](https://mlanthology.org/neurips/1994/pedersen1994neurips-recurrent/)
BibTeX
@inproceedings{pedersen1994neurips-recurrent,
title = {{Recurrent Networks: Second Order Properties and Pruning}},
author = {Pedersen, Morten With and Hansen, Lars Kai},
booktitle = {Neural Information Processing Systems},
year = {1994},
pages = {673-680},
url = {https://mlanthology.org/neurips/1994/pedersen1994neurips-recurrent/}
}