Combined 5 × 2 cv F Test for Comparing Supervised Classification Learning Algorithms

Abstract

Dietterich (1998) reviews five statistical tests and proposes the 5 × 2 cv t test for determining whether there is a significant difference between the error rates of two classifiers. In our experiments, we noticed that the 5 × 2 cv t test result may vary depending on factors that should not affect the test, and we propose a variant, the combined 5 × 2 cv F test, that combines multiple statistics to get a more robust test. Simulation results show that this combined version of the test has lower type I error and higher power than 5 × 2 cv proper.
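The abstract does not state the statistic itself. As a minimal sketch, assuming the commonly cited form of the combined test (the sum of all ten squared error-rate differences divided by twice the sum of the five per-replication variance estimates, compared against an F distribution with 10 and 5 degrees of freedom), the computation might look like the following. The function name `combined_5x2cv_f` and the input layout are hypothetical choices for illustration; they are not taken from the paper.

```python
def combined_5x2cv_f(diffs):
    """Sketch of the combined 5 x 2 cv F statistic (assumed form).

    diffs: five (p1, p2) pairs, one per replication of 2-fold
    cross-validation. p1 and p2 are the differences between the two
    classifiers' error rates on fold 1 and fold 2 of that replication.
    """
    if len(diffs) != 5 or any(len(pair) != 2 for pair in diffs):
        raise ValueError("expected five (p1, p2) pairs")

    squared_sum = 0.0   # sum of all ten squared differences
    variance_sum = 0.0  # sum of the five per-replication variances

    for p1, p2 in diffs:
        mean = (p1 + p2) / 2
        variance_sum += (p1 - mean) ** 2 + (p2 - mean) ** 2
        squared_sum += p1 ** 2 + p2 ** 2

    if variance_sum == 0:
        raise ValueError("all variance estimates are zero; statistic undefined")

    # Assumed to be approximately F-distributed with (10, 5) degrees of freedom.
    return squared_sum / (2 * variance_sum)


# Illustrative (made-up) error-rate differences, not results from the paper.
example_diffs = [(0.02, 0.04), (0.01, 0.03), (0.03, 0.02), (0.02, 0.05), (0.04, 0.01)]
f_stat = combined_5x2cv_f(example_diffs)
print(f"f = {f_stat:.3f}")
```

Under that assumption, the null hypothesis of equal error rates would be rejected at the 0.05 level when the statistic exceeds the 0.95 quantile of F(10, 5), which is roughly 4.74.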

Cite

Text

Alpaydin. "Combined 5 X 2 Cv F Test for Comparing Supervised Classification Learning Algorithms." Neural Computation, 1999. doi:10.1162/089976699300016007

Markdown

[Alpaydin. "Combined 5 X 2 Cv F Test for Comparing Supervised Classification Learning Algorithms." Neural Computation, 1999.](https://mlanthology.org/neco/1999/alpaydin1999neco-combined/) doi:10.1162/089976699300016007

BibTeX

@article{alpaydin1999neco-combined,
  title     = {{Combined 5 X 2 Cv F Test for Comparing Supervised Classification Learning Algorithms}},
  author    = {Alpaydin, Ethem},
  journal   = {Neural Computation},
  year      = {1999},
  pages     = {1885-1892},
  doi       = {10.1162/089976699300016007},
  volume    = {11},
  url       = {https://mlanthology.org/neco/1999/alpaydin1999neco-combined/}
}