The Effective Size of a Neural Network: A Principal Component Approach
Abstract
Often when learning from data, one attaches a penalty term to a standard error term in an attempt to prefer simple models and prevent overfitting. Current penalty terms for neural networks, however, often do not take into account weight interaction. This is a critical drawback since the effective number of parameters in a network usually differs dramatically from the total number of possible parameters. In this paper, we present a penalty term that uses Principal Component Analysis to help detect functional redundancy in a neural network. Results show that our new algorithm gives a much more accurate estimate of network complexity than do standard approaches. As a result, our new term should be able to improve techniques that make use of a penalty term, such as weight decay, weight pruning, feature selection, Bayesian, and prediction-risk techniques.
1 Introduction
Overfitting is a well-studied phenomenon [Geman et al., 1992; Holder, 1991; Weigend, 1993] where a learning algorithm ...
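The abstract's idea of using Principal Component Analysis to detect functional redundancy can be illustrated with a small sketch. This is not the paper's algorithm, which the text above does not spell out. It assumes that redundancy is measured on a matrix of hidden-unit activations (rows are examples, columns are hidden units) and that the "effective" number of units is the number of principal components needed to explain a chosen fraction of the variance. The function name `effective_hidden_units` and the 0.95 threshold are hypothetical choices made for illustration only.

```python
import numpy as np


def effective_hidden_units(activations, variance_kept=0.95):
    """Count the principal components needed to explain `variance_kept`
    of the total variance in a (n_examples, n_hidden) activation matrix.

    Illustrative assumption only: this is one plausible way to measure
    redundancy with PCA, not the penalty term defined in the paper.
    """
    activations = np.asarray(activations, dtype=float)
    # Center each hidden unit so the SVD corresponds to PCA.
    centered = activations - activations.mean(axis=0)
    # Squared singular values are proportional to the component variances.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    total = variances.sum()
    if total == 0.0:
        return 0  # constant activations carry no variance at all
    cumulative = np.cumsum(variances) / total
    # First index whose cumulative share reaches the threshold, plus one.
    count = int(np.searchsorted(cumulative, variance_kept)) + 1
    return min(count, len(variances))


rng = np.random.default_rng(0)

# Six hidden units that are all linear mixtures of two underlying signals:
# the nominal size is 6, but at most 2 directions carry any variance.
signals = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
redundant = signals @ mixing

# Six hidden units with independent activations: little redundancy.
independent = rng.normal(size=(200, 6))

k_redundant = effective_hidden_units(redundant)
k_independent = effective_hidden_units(independent)
print(k_redundant, k_independent)
```

In this toy setting the redundant layer reports far fewer components than its six nominal units, while the independent layer reports nearly all of them, which is the gap between total and effective parameters that the abstract describes. A real network's hidden units are nonlinear, so the count would depend on the chosen variance threshold.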
Cite
Text
Opitz. "The Effective Size of a Neural Network: A Principal Component Approach." International Conference on Machine Learning, 1997.
Markdown
[Opitz. "The Effective Size of a Neural Network: A Principal Component Approach." International Conference on Machine Learning, 1997.](https://mlanthology.org/icml/1997/opitz1997icml-effective/)
BibTeX
@inproceedings{opitz1997icml-effective,
title = {{The Effective Size of a Neural Network: A Principal Component Approach}},
author = {Opitz, David W.},
booktitle = {International Conference on Machine Learning},
year = {1997},
pages = {263--271},
url = {https://mlanthology.org/icml/1997/opitz1997icml-effective/}
}