The Effective Size of a Neural Network: A Principal Component Approach

Abstract

Often when learning from data, one attaches a penalty term to a standard error term in an attempt to prefer simple models and prevent overfitting. Current penalty terms for neural networks, however, often do not take into account weight interaction. This is a critical drawback since the effective number of parameters in a network usually differs dramatically from the total number of possible parameters. In this paper, we present a penalty term that uses Principal Component Analysis to help detect functional redundancy in a neural network. Results show that our new algorithm gives a much more accurate estimate of network complexity than do standard approaches. As a result, our new term should be able to improve techniques that make use of a penalty term, such as weight decay, weight pruning, feature selection, Bayesian, and prediction-risk techniques.

1 Introduction

Overfitting is a well-studied phenomenon [Geman et al., 1992; Holder, 1991; Weigend, 1993] where a learning algorithm ...
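The abstract's central idea, using Principal Component Analysis to expose functional redundancy, can be illustrated with a small sketch. This is not the paper's penalty term, whose exact form is not given in the text above. It assumes, purely for illustration, that redundancy is measured on a matrix of hidden-unit activations (rows are examples, columns are hidden units) and that the "effective" number of units is the number of principal components needed to retain a chosen share of the variance. The function name `effective_hidden_units` and the 0.99 threshold are hypothetical choices.

```python
import numpy as np


def effective_hidden_units(activations, variance_kept=0.99):
    """Count the principal components needed to explain `variance_kept`
    of the total variance in a (num_examples, num_units) activation matrix.

    Illustrative assumption only: this is a simple PCA-based redundancy
    count, not the penalty term defined in the paper.
    """
    activations = np.asarray(activations, dtype=float)
    # Center each hidden unit's activations so the SVD corresponds to PCA.
    centered = activations - activations.mean(axis=0)
    # Squared singular values are proportional to the component variances.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    total = variances.sum()
    if total == 0.0:
        # Constant activations carry no variance at all.
        return 0
    cumulative = np.cumsum(variances) / total
    # First index whose cumulative share reaches the threshold, as a count.
    count = int(np.searchsorted(cumulative, variance_kept)) + 1
    return min(count, len(variances))


# Three mutually orthogonal, zero-mean activation patterns ...
base = np.array([
    [1, 1, 1],
    [1, 1, -1],
    [1, -1, 1],
    [1, -1, -1],
    [-1, 1, 1],
    [-1, 1, -1],
    [-1, -1, 1],
    [-1, -1, -1],
], dtype=float)
# ... duplicated, giving six hidden units of which only three are distinct.
hidden = np.hstack([base, base])

print(hidden.shape[1], "units,", effective_hidden_units(hidden), "effective")
```

In this constructed case the six units collapse to three components, which mirrors the abstract's point that the effective number of parameters can differ sharply from the nominal count when units interact or duplicate one another.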

Cite

Text

Opitz. "The Effective Size of a Neural Network: A Principal Component Approach." International Conference on Machine Learning, 1997.

Markdown

[Opitz. "The Effective Size of a Neural Network: A Principal Component Approach." International Conference on Machine Learning, 1997.](https://mlanthology.org/icml/1997/opitz1997icml-effective/)

BibTeX

@inproceedings{opitz1997icml-effective,
  title     = {{The Effective Size of a Neural Network: A Principal Component Approach}},
  author    = {Opitz, David W.},
  booktitle = {International Conference on Machine Learning},
  year      = {1997},
  pages     = {263--271},
  url       = {https://mlanthology.org/icml/1997/opitz1997icml-effective/}
}