The VC-Dimension Versus the Statistical Capacity of Multilayer Networks
Abstract
A general relationship is developed between the VC-dimension and the statistical lower epsilon-capacity, which shows that the VC-dimension can be lower bounded (in order) by the statistical lower epsilon-capacity of a network trained with random samples. This relationship explains quantitatively how generalization takes place after memorization, and relates the concept of generalization (consistency) to the capacity of the optimal classifier over a class of classifiers with the same structure and to the capacity of the Bayesian classifier. Furthermore, it provides a general methodology for evaluating a lower bound on the VC-dimension of feedforward multilayer neural networks. This methodology is applied to two types of networks which are important for hardware implementations: two-layer (N - 2L - 1) networks with binary weights, integer thresholds for the hidden units and zero threshold for the output unit, and a single neuron ((N - 1) networks) with binary weights and a zero threshold. Specifically, we obtain O(W/ln L) ≤ d2 ≤ O(W), and d1 ~ O(N). Here W is the total number of weights of the (N - 2L - 1) networks; d1 and d2 represent the VC-dimensions of the (N - 1) and (N - 2L - 1) networks, respectively.
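As a rough illustration of how these orders scale, the Python sketch below tabulates the quoted bound orders for a few (N - 2L - 1) architectures. The weight count W = 2L(N + 1), i.e. 2LN input-to-hidden weights plus 2L hidden-to-output weights with thresholds excluded, is an assumption here; the abstract does not spell out exactly what W counts.

```python
import math

def weight_count(n_inputs: int, n_hidden: int) -> int:
    """Total number of weights W for an (N - 2L - 1) network,
    assuming W = N * 2L input-to-hidden weights plus 2L
    hidden-to-output weights (thresholds not counted)."""
    return n_inputs * n_hidden + n_hidden

for N, L in [(10, 4), (100, 16), (1000, 64)]:
    W = weight_count(N, 2 * L)       # W = 2L(N + 1) under the assumption above
    lower = W / math.log(L)          # order of the lower bound on d2
    upper = W                        # order of the upper bound on d2
    print(f"N={N:5d} L={L:3d}  W={W:7d}  "
          f"W/ln L~{lower:9.1f}  W={upper:7d}  d1~O(N)={N}")
```

Under this reading, the lower and upper bounds on d2 differ only by a factor of ln L, while d1 for the single binary-weight neuron stays linear in the input dimension N.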
Cite
Text
Ji and Psaltis. "The VC-Dimension Versus the Statistical Capacity of Multilayer Networks." Neural Information Processing Systems, 1991.
Markdown
[Ji and Psaltis. "The VC-Dimension Versus the Statistical Capacity of Multilayer Networks." Neural Information Processing Systems, 1991.](https://mlanthology.org/neurips/1991/ji1991neurips-vcdimension/)
BibTeX
@inproceedings{ji1991neurips-vcdimension,
  title = {{The VC-Dimension Versus the Statistical Capacity of Multilayer Networks}},
  author = {Ji, Chuanyi and Psaltis, Demetri},
  booktitle = {Neural Information Processing Systems},
  year = {1991},
  pages = {928--935},
  url = {https://mlanthology.org/neurips/1991/ji1991neurips-vcdimension/}
}