Statistical Theory of Overtraining - Is Cross-Validation Asymptotically Effective?
Abstract
A statistical theory for overtraining is proposed. The analysis treats realizable stochastic neural networks trained with Kullback-Leibler loss in the asymptotic case. It is shown that the asymptotic gain in the generalization error is small if we perform early stopping, even if we have access to the optimal stopping time. Considering cross-validation stopping, we answer the question: in what ratio should the examples be divided into training and testing sets in order to obtain optimum performance? In the non-asymptotic region, cross-validated early stopping always decreases the generalization error. Our large-scale simulations, performed on a CM-5, are in good agreement with our analytical findings.
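To make the setting concrete, below is a minimal Python sketch of cross-validated early stopping on a toy realizable model. The logistic model, synthetic data, split ratio `r`, learning rate, and patience rule are illustrative assumptions for this sketch, not the paper's experimental setup; the paper's question is how to choose the ratio `r` optimally.

```python
# Minimal sketch of cross-validated early stopping (illustrative only).
# Assumptions: a toy logistic-regression "network", synthetic realizable data,
# a hand-picked split ratio r, and a simple patience rule for stopping.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic realizable task: labels generated by a true weight vector.
n, d = 1000, 20
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ w_true)))

# Split the n examples into training and testing (validation) sets in ratio r.
r = 0.3  # example value; the paper studies how this ratio should be chosen
n_val = int(r * n)
X_val, y_val = X[:n_val], y[:n_val]
X_tr, y_tr = X[n_val:], y[n_val:]

def nll(w, X, y):
    """Average negative log-likelihood (cross-entropy) loss."""
    z = X @ w
    return np.mean(np.log1p(np.exp(z)) - y * z)

def grad(w, X, y):
    """Gradient of the average negative log-likelihood."""
    z = X @ w
    return X.T @ (1.0 / (1.0 + np.exp(-z)) - y) / len(y)

w = np.zeros(d)
lr, patience = 0.1, 20
best_val, best_w, bad_steps = np.inf, w.copy(), 0

for step in range(5000):
    w -= lr * grad(w, X_tr, y_tr)
    val_loss = nll(w, X_val, y_val)
    if val_loss < best_val:        # validation loss still improving
        best_val, best_w, bad_steps = val_loss, w.copy(), 0
    else:                          # no improvement: count toward stopping
        bad_steps += 1
        if bad_steps >= patience:  # cross-validated early stopping triggered
            break

print(f"stopped at step {step}, best validation loss {best_val:.4f}")
```

The trade-off the abstract analyzes appears directly here: a larger `r` gives a more reliable stopping signal but fewer training examples, and the paper asks which split is optimal in the asymptotic regime.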
Cite
Text
Amari et al. "Statistical Theory of Overtraining - Is Cross-Validation Asymptotically Effective?" Neural Information Processing Systems, 1995.

Markdown

[Amari et al. "Statistical Theory of Overtraining - Is Cross-Validation Asymptotically Effective?" Neural Information Processing Systems, 1995.](https://mlanthology.org/neurips/1995/amari1995neurips-statistical/)

BibTeX
@inproceedings{amari1995neurips-statistical,
title = {{Statistical Theory of Overtraining - Is Cross-Validation Asymptotically Effective?}},
author = {Amari, Shun-ichi and Murata, Noboru and Müller, Klaus-Robert and Finke, Michael and Yang, Howard Hua},
booktitle = {Neural Information Processing Systems},
year = {1995},
pages = {176-182},
url = {https://mlanthology.org/neurips/1995/amari1995neurips-statistical/}
}