No Unbiased Estimator of the Variance of K-Fold Cross-Validation

Abstract

Most machine learning researchers perform quantitative experiments to estimate generalization error and compare algorithm performances. In order to draw statistically convincing conclusions, it is important to estimate the uncertainty of such estimates. This paper studies the estimation of uncertainty around the K-fold cross-validation estimator. The main theorem shows that there exists no universal unbiased estimator of the variance of K-fold cross-validation. An analysis based on the eigendecomposition of the covariance matrix of errors helps to better understand the nature of the problem and shows that naive estimators may grossly underestimate variance, as confirmed by numerical experiments.
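To make the abstract's claim concrete, here is a minimal sketch of the "naive" variance estimator it warns about: the one that treats the per-example cross-validation errors as independent, ignoring the correlations induced by overlapping training folds. The toy task (estimating a mean with squared error) and all variable names are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: the "model" trained on K-1 folds is the training-fold mean,
# and the per-example error is the squared deviation on the held-out fold.
n, K = 100, 5
data = rng.normal(size=n)
folds = np.array_split(np.arange(n), K)

errors = np.empty(n)
for test_idx in folds:
    train_idx = np.setdiff1d(np.arange(n), test_idx)
    pred = data[train_idx].mean()              # fit without the held-out fold
    errors[test_idx] = (data[test_idx] - pred) ** 2

cv_estimate = errors.mean()                    # K-fold CV estimate of generalization error

# Naive variance estimate: sample variance of the n errors divided by n.
# This treats the errors as i.i.d., but errors within a fold share a model
# and errors across folds share most training data, so the paper shows
# estimators of this kind can grossly underestimate the true variance.
naive_variance = errors.var(ddof=1) / n
```

The point of the paper is that no estimator computed from the errors of a single K-fold run can correct this bias for every distribution, because the correlation structure among the errors is not identifiable from one run.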

Cite

Text

Bengio and Grandvalet. "No Unbiased Estimator of the Variance of K-Fold Cross-Validation." Neural Information Processing Systems, 2003.

Markdown

[Bengio and Grandvalet. "No Unbiased Estimator of the Variance of K-Fold Cross-Validation." Neural Information Processing Systems, 2003.](https://mlanthology.org/neurips/2003/bengio2003neurips-unbiased/)

BibTeX

@inproceedings{bengio2003neurips-unbiased,
  title     = {{No Unbiased Estimator of the Variance of K-Fold Cross-Validation}},
  author    = {Bengio, Yoshua and Grandvalet, Yves},
  booktitle = {Neural Information Processing Systems},
  year      = {2003},
  pages     = {513--520},
  url       = {https://mlanthology.org/neurips/2003/bengio2003neurips-unbiased/}
}