Bias Plus Variance Decomposition for Zero-One Loss Functions

Abstract

We present a bias-variance decomposition of expected misclassification rate, the most commonly used loss function in supervised classification learning. The bias-variance decomposition for quadratic loss functions is well known and serves as an important tool for analyzing learning algorithms, yet no decomposition was offered for the more commonly used zero-one (misclassification) loss functions until the recent work of Kong & Dietterich (1995) and Breiman (1996). Their decompositions suffer from some major shortcomings, however (e.g., potentially negative variance), which our decomposition avoids. We show that, in practice, the naive frequency-based estimation of the decomposition terms is itself biased, and we show how to correct for this bias. We illustrate the decomposition on various algorithms and datasets from the UCI repository.
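The abstract does not state the decomposition itself, so the following is only a minimal sketch of the form in which it is commonly cited: at a single test point, noise, squared bias, and variance are each computed from the class distribution of the target and the class distribution of the learner's predictions over training sets. The function name `kw_decomposition` and the example probabilities are assumptions made for illustration, and the sketch does not include the correction for estimation bias that the abstract mentions.

```python
def kw_decomposition(p_target, p_pred):
    """Zero-one loss decomposition at one test point (illustrative sketch).

    p_target: class probabilities of the true label at this point.
    p_pred:   class probabilities of the learner's prediction at this
              point, taken over training sets.
    Returns (noise, bias_sq, variance).
    """
    if len(p_target) != len(p_pred):
        raise ValueError("distributions must cover the same classes")
    # Intrinsic noise: spread of the target distribution itself.
    noise = 0.5 * (1.0 - sum(p * p for p in p_target))
    # Squared bias: distance between target and prediction distributions.
    bias_sq = 0.5 * sum((pt - ph) ** 2 for pt, ph in zip(p_target, p_pred))
    # Variance: spread of the learner's predictions; never negative.
    variance = 0.5 * (1.0 - sum(p * p for p in p_pred))
    return noise, bias_sq, variance


# Hypothetical three-class example with a deterministic target.
p_target = [1.0, 0.0, 0.0]
p_pred = [0.6, 0.3, 0.1]
noise, bias_sq, variance = kw_decomposition(p_target, p_pred)

# Expected misclassification rate when label and prediction are independent.
expected_error = 1.0 - sum(pt * ph for pt, ph in zip(p_target, p_pred))
print(noise, bias_sq, variance, expected_error)
```

In this sketch the three terms sum to the expected misclassification rate at the point, and the variance term is non-negative by construction because the sum of squared probabilities cannot exceed one.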

Cite

Text

Kohavi and Wolpert. "Bias Plus Variance Decomposition for Zero-One Loss Functions." International Conference on Machine Learning, 1996.

Markdown

[Kohavi and Wolpert. "Bias Plus Variance Decomposition for Zero-One Loss Functions." International Conference on Machine Learning, 1996.](https://mlanthology.org/icml/1996/kohavi1996icml-bias/)

BibTeX

@inproceedings{kohavi1996icml-bias,
  title     = {{Bias Plus Variance Decomposition for Zero-One Loss Functions}},
  author    = {Kohavi, Ron and Wolpert, David H.},
  booktitle = {International Conference on Machine Learning},
  year      = {1996},
  pages     = {275--283},
  url       = {https://mlanthology.org/icml/1996/kohavi1996icml-bias/}
}