Statistical Mechanics of Learning in a Large Committee Machine

Abstract

We use statistical mechanics to study generalization in large committee machines. For an architecture with nonoverlapping receptive fields, a replica calculation yields the generalization error in the limit of a large number of hidden units. For continuous weights, the generalization error falls off asymptotically in inverse proportion to α, the number of training examples per weight. For binary weights we find a discontinuous transition from poor to perfect generalization, followed by a wide region of metastability. Broken replica symmetry is found within this region at low temperatures. For a fully connected architecture, the generalization error is calculated within the annealed approximation. For both binary and continuous weights we find transitions from a symmetric state to one with specialized hidden units, accompanied by discontinuous drops in the generalization error.
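The continuous-weight result stated in the abstract can be written compactly. As a sketch only: here ε_g denotes the generalization error, α the number of training examples per weight, and c an unspecified constant; the decomposition α = P/N (P examples, N weights) is a conventional reading not spelled out in the abstract.

```latex
% Asymptotic fall-off of the generalization error for continuous
% weights, in the limit of many hidden units (c is an unspecified
% constant; alpha = P/N is assumed notation):
\epsilon_g \sim \frac{c}{\alpha},
\qquad \alpha = \frac{P}{N} \to \infty
```

The binary-weight case behaves differently: rather than this smooth 1/α decay, the abstract reports a discontinuous jump from poor to perfect generalization.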

Cite

Text

Schwarze and Hertz. "Statistical Mechanics of Learning in a Large Committee Machine." Neural Information Processing Systems, 1992.

Markdown

[Schwarze and Hertz. "Statistical Mechanics of Learning in a Large Committee Machine." Neural Information Processing Systems, 1992.](https://mlanthology.org/neurips/1992/schwarze1992neurips-statistical/)

BibTeX

@inproceedings{schwarze1992neurips-statistical,
  title     = {{Statistical Mechanics of Learning in a Large Committee Machine}},
  author    = {Schwarze, Holm and Hertz, John A.},
  booktitle = {Neural Information Processing Systems},
  year      = {1992},
  pages     = {523--530},
  url       = {https://mlanthology.org/neurips/1992/schwarze1992neurips-statistical/}
}