Statistical Theory of Overtraining - Is Cross-Validation Asymptotically Effective?

Abstract

A statistical theory for overtraining is proposed. The analysis treats realizable stochastic neural networks trained with Kullback-Leibler loss in the asymptotic case. It is shown that the asymptotic gain in the generalization error is small if we perform early stopping, even if we have access to the optimal stopping time. Considering cross-validation stopping, we answer the question: in what ratio should the examples be divided into training and testing sets in order to obtain the optimum performance? In the non-asymptotic region, cross-validated early stopping always decreases the generalization error. Our large-scale simulations done on a CM5 are in close agreement with our analytical findings.
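To make the procedure studied in the abstract concrete, the sketch below shows cross-validated early stopping on a simple synthetic task: a fraction r of the examples is held out, a logistic model (a stand-in for the paper's stochastic networks) is trained by gradient descent on cross-entropy (Kullback-Leibler) loss, and the weights with the lowest validation loss are kept. The split ratio, model, data generator, and all names here are illustrative assumptions, not the authors' experimental setup or the paper's optimal split result.

import numpy as np

rng = np.random.default_rng(0)
D = 10
W_TRUE = rng.normal(size=D)  # fixed "teacher" weights: the task is realizable

def make_data(n):
    """Synthetic realizable task: labels drawn from a logistic teacher."""
    X = rng.normal(size=(n, D))
    p = 1.0 / (1.0 + np.exp(-X @ W_TRUE))
    return X, rng.binomial(1, p)

def kl_loss(w, X, y, eps=1e-12):
    """Average cross-entropy (KL) loss of a logistic model."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def grad(w, X, y):
    """Gradient of the cross-entropy loss for the logistic model."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return X.T @ (p - y) / len(y)

def train_with_early_stopping(X, y, r=0.3, lr=0.5, epochs=500):
    """Hold out a fraction r of the examples for validation and keep the
    weights that minimize the validation loss (cross-validation stopping)."""
    n_val = int(r * len(y))
    X_val, y_val = X[:n_val], y[:n_val]
    X_tr, y_tr = X[n_val:], y[n_val:]

    w = np.zeros(D)
    best_w, best_val = w.copy(), np.inf
    for _ in range(epochs):
        w -= lr * grad(w, X_tr, y_tr)
        val = kl_loss(w, X_val, y_val)
        if val < best_val:                 # remember the best point on the validation curve
            best_val, best_w = val, w.copy()
    return best_w, w                       # early-stopped weights vs. fully trained weights

if __name__ == "__main__":
    X, y = make_data(400)
    X_test, y_test = make_data(5000)       # large held-out set to estimate generalization error
    w_stop, w_full = train_with_early_stopping(X, y)
    print("generalization loss, early stopped:", kl_loss(w_stop, X_test, y_test))
    print("generalization loss, fully trained:", kl_loss(w_full, X_test, y_test))

Varying r in this sketch is the experiment the abstract alludes to: how the training/validation split ratio affects the generalization error of the early-stopped model.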

Cite

Text

Amari et al. "Statistical Theory of Overtraining - Is Cross-Validation Asymptotically Effective?" Neural Information Processing Systems, 1995.

Markdown

[Amari et al. "Statistical Theory of Overtraining - Is Cross-Validation Asymptotically Effective?" Neural Information Processing Systems, 1995.](https://mlanthology.org/neurips/1995/amari1995neurips-statistical/)

BibTeX

@inproceedings{amari1995neurips-statistical,
  title     = {{Statistical Theory of Overtraining - Is Cross-Validation Asymptotically Effective?}},
  author    = {Amari, Shun-ichi and Murata, Noboru and Müller, Klaus-Robert and Finke, Michael and Yang, Howard Hua},
  booktitle = {Neural Information Processing Systems},
  year      = {1995},
  pages     = {176-182},
  url       = {https://mlanthology.org/neurips/1995/amari1995neurips-statistical/}
}