No Unbiased Estimator of the Variance of K-Fold Cross-Validation
Abstract
Most machine learning researchers perform quantitative experiments to estimate generalization error and compare algorithm performances. In order to draw statistically convincing conclusions, it is important to estimate the uncertainty of such estimates. This paper studies the estimation of uncertainty around the K-fold cross-validation estimator. The main theorem shows that there exists no universal unbiased estimator of the variance of K-fold cross-validation. An analysis based on the eigendecomposition of the covariance matrix of errors helps to better understand the nature of the problem and shows that naive estimators may grossly underestimate variance, as confirmed by numerical experiments.
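The following is a minimal simulation sketch (not taken from the paper) of the phenomenon the abstract describes: the naive variance estimate of the K-fold CV score, obtained by treating the per-example test errors as i.i.d., tends to understate the actual variance of the CV estimate because errors share training folds and are therefore correlated. The concrete setup (Gaussian data, a constant "predict the training mean" model, squared error loss, 10 folds) is a hypothetical choice for illustration only.

```python
# Hypothetical illustration: naive variance estimate of the K-fold CV score
# versus its true variance measured over many independent datasets.
import numpy as np

rng = np.random.default_rng(0)
n, K, n_repeats = 100, 10, 2000

def kfold_cv_errors(y, K):
    """Per-example squared errors from one K-fold CV run of the 'training mean' predictor."""
    folds = np.array_split(np.arange(len(y)), K)
    errors = np.empty(len(y))
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        prediction = y[train_mask].mean()      # "fit": mean of the training folds
        errors[test_idx] = (y[test_idx] - prediction) ** 2
    return errors

cv_scores, naive_vars = [], []
for _ in range(n_repeats):
    y = rng.normal(size=n)                     # fresh dataset each repetition
    e = kfold_cv_errors(y, K)
    cv_scores.append(e.mean())                 # K-fold CV estimate of generalization error
    naive_vars.append(e.var(ddof=1) / n)       # naive variance: errors treated as i.i.d.

print("true variance of CV score   :", np.var(cv_scores, ddof=1))
print("mean naive variance estimate:", np.mean(naive_vars))
# The naive estimate is typically noticeably smaller than the true variance,
# because the i.i.d. assumption ignores the correlations between errors
# induced by the overlapping training folds.
```

Running this sketch shows the qualitative effect discussed in the paper (naive underestimation of the variance); the exact magnitude depends on the data distribution, the learning algorithm, and K.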
Cite
Text
Bengio and Grandvalet. "No Unbiased Estimator of the Variance of K-Fold Cross-Validation." Neural Information Processing Systems, 2003.
Markdown
[Bengio and Grandvalet. "No Unbiased Estimator of the Variance of K-Fold Cross-Validation." Neural Information Processing Systems, 2003.](https://mlanthology.org/neurips/2003/bengio2003neurips-unbiased/)
BibTeX
@inproceedings{bengio2003neurips-unbiased,
title = {{No Unbiased Estimator of the Variance of K-Fold Cross-Validation}},
author = {Bengio, Yoshua and Grandvalet, Yves},
booktitle = {Neural Information Processing Systems},
year = {2003},
pages = {513-520},
url = {https://mlanthology.org/neurips/2003/bengio2003neurips-unbiased/}
}