A Bayesian Approach for Comparing Cross-Validated Algorithms on Multiple Data Sets

Abstract

We present a Bayesian approach for making statistical inferences about the accuracy (or any other score) of two competing algorithms that have been assessed via cross-validation on multiple data sets. The approach consists of two pieces. The first is a novel Bayesian correlated $t$ test for analyzing the cross-validation results on a single data set; it accounts for the correlation due to the overlapping training sets. The second piece merges the posterior probabilities computed by the Bayesian correlated $t$ test on the different data sets to make inferences on multiple data sets. It does so by adopting a Poisson-binomial model. The inferences on multiple data sets account for the different uncertainty of the cross-validation results on the different data sets. It is the first test able to achieve this goal. It is generally more powerful than the signed-rank test if ten runs of cross-validation are performed, as is generally recommended.
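The merging step can be illustrated with a minimal sketch. It assumes that each data set yields a posterior probability that the first algorithm is better, and that the data sets are treated as independent, so that the number of data sets on which it is better follows a Poisson-binomial distribution. The function names, the example probabilities, and the "better on more than half of the data sets" summary are illustrative assumptions, not taken from the paper.

```python
def poisson_binomial_pmf(probs):
    """PMF of the number of successes among independent Bernoulli trials
    with (possibly different) success probabilities `probs`."""
    pmf = [1.0]  # with zero trials, P(0 successes) = 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this trial fails
            nxt[k + 1] += mass * p      # this trial succeeds
        pmf = nxt
    return pmf


def prob_better_on_majority(probs):
    """Probability that the count of successes exceeds half the trials.
    Assumed summary for illustration; the paper's decision rule may differ."""
    pmf = poisson_binomial_pmf(probs)
    q = len(probs)
    return sum(mass for k, mass in enumerate(pmf) if k > q / 2)


if __name__ == "__main__":
    # Hypothetical per-data-set posterior probabilities (made-up values).
    posteriors = [0.90, 0.60, 0.75, 0.40, 0.95]
    print(prob_better_on_majority(posteriors))
```

Because each data set contributes its own probability rather than a plain win or loss, a data set with an uncertain cross-validation result (probability near 0.5) influences the outcome less than one with a clear result.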

Cite

Text

Corani and Benavoli. "A Bayesian Approach for Comparing Cross-Validated Algorithms on Multiple Data Sets." Machine Learning, 2015. doi:10.1007/S10994-015-5486-Z

Markdown

[Corani and Benavoli. "A Bayesian Approach for Comparing Cross-Validated Algorithms on Multiple Data Sets." Machine Learning, 2015.](https://mlanthology.org/mlj/2015/corani2015mlj-bayesian/) doi:10.1007/S10994-015-5486-Z

BibTeX

@article{corani2015mlj-bayesian,
  title     = {{A Bayesian Approach for Comparing Cross-Validated Algorithms on Multiple Data Sets}},
  author    = {Corani, Giorgio and Benavoli, Alessio},
  journal   = {Machine Learning},
  year      = {2015},
  pages     = {285-304},
  doi       = {10.1007/S10994-015-5486-Z},
  volume    = {100},
  url       = {https://mlanthology.org/mlj/2015/corani2015mlj-bayesian/}
}