Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier

Abstract

The simple Bayesian classifier (SBC) is commonly thought to assume that attributes are independent given the class, but this is apparently contradicted by the surprisingly good performance it exhibits in many domains that contain clear attribute dependences. No explanation for this has been proposed so far. In this paper we show that the SBC does not in fact assume attribute independence, and can be optimal even when this assumption is violated by a wide margin. The key to this finding lies in the distinction between classification and probability estimation: correct classification can be achieved even when the probability estimates used contain large errors. We show that the previously-assumed region of optimality of the SBC is a second-order infinitesimal fraction of the actual one. This is followed by the derivation of several necessary and several sufficient conditions for the optimality of the SBC. For example, the SBC is optimal for learning arbitrary conjunctions and disjunction...
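The distinction the abstract draws between classification and probability estimation can be illustrated with a small numeric sketch. The setup is a hypothetical one chosen for illustration, not taken from the paper: two equally likely classes, one Boolean attribute A with P(A=1|+) = 3/5 and P(A=1|-) = 2/5, and a second attribute B that is an exact copy of A, so the two attributes are maximally dependent. The helper name `naive_bayes_posterior` is likewise an assumption of this sketch.

```python
from fractions import Fraction


def naive_bayes_posterior(prior_pos, likelihoods_pos, likelihoods_neg):
    """Return P(+ | x) computed with the naive product-of-likelihoods rule.

    Each list holds one per-attribute likelihood P(x_i | class) for the
    observed instance; the attributes are multiplied as if independent.
    """
    score_pos = prior_pos
    score_neg = 1 - prior_pos
    for lik_pos, lik_neg in zip(likelihoods_pos, likelihoods_neg):
        score_pos *= lik_pos
        score_neg *= lik_neg
    return score_pos / (score_pos + score_neg)


# Hypothetical values, assumed for illustration only.
prior = Fraction(1, 2)    # P(+)
p_a_pos = Fraction(3, 5)  # P(A=1 | +)
p_a_neg = Fraction(2, 5)  # P(A=1 | -)

# True posterior for an instance with A=1: B carries no extra information,
# so only A should be counted.
true_posterior = naive_bayes_posterior(prior, [p_a_pos], [p_a_neg])

# The simple Bayesian classifier treats A and its copy B as independent,
# so the same evidence is counted twice.
nb_posterior = naive_bayes_posterior(
    prior, [p_a_pos, p_a_pos], [p_a_neg, p_a_neg]
)

# Predict "+" whenever the estimated posterior exceeds 1/2.
true_label = true_posterior > Fraction(1, 2)
nb_label = nb_posterior > Fraction(1, 2)

print("true P(+|x):", true_posterior)
print("SBC  P(+|x):", nb_posterior)
print("same predicted class:", true_label == nb_label)
```

In this sketch the double-counted evidence pushes the estimate from 3/5 to 9/13, so the probability estimate is wrong, yet both values lie on the same side of 1/2 and the predicted class is unchanged.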

Cite

Text

Domingos and Pazzani. "Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier." International Conference on Machine Learning, 1996.

Markdown

[Domingos and Pazzani. "Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier." International Conference on Machine Learning, 1996.](https://mlanthology.org/icml/1996/domingos1996icml-beyond/)

BibTeX

@inproceedings{domingos1996icml-beyond,
  title     = {{Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier}},
  author    = {Domingos, Pedro M. and Pazzani, Michael J.},
  booktitle = {International Conference on Machine Learning},
  year      = {1996},
  pages     = {105--112},
  url       = {https://mlanthology.org/icml/1996/domingos1996icml-beyond/}
}