Boosted Classification Trees and Class Probability/Quantile Estimation
Abstract
The standard by which binary classifiers are usually judged, misclassification error, assumes equal costs of misclassifying the two classes or, equivalently, classifying at the 1/2 quantile of the conditional class probability function P[y=1|x]. Boosted classification trees are known to perform quite well for such problems. In this article we consider the use of standard, off-the-shelf boosting for two more general problems: 1) classification with unequal costs or, equivalently, classification at quantiles other than 1/2, and 2) estimation of the conditional class probability function P[y=1|x]. We first examine whether the latter problem, estimation of P[y=1|x], can be solved with LogitBoost, and with AdaBoost when combined with a natural link function. The answer is negative: both approaches are often ineffective because they overfit P[y=1|x] even though they perform well as classifiers. A major negative point of the present article is the disconnect between class probability estimation and classification.
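The "natural link function" for AdaBoost mentioned above is usually taken to be the half-log-odds relation: if the boosted score F(x) estimates (1/2) log(p/(1-p)), then p = 1/(1+exp(-2F(x))), and classifying at a quantile q amounts to thresholding this probability at q. As a minimal sketch (the function names are illustrative, not from the paper):

```python
import math

def prob_from_score(F):
    """Invert the half-log-odds link: if the boosted score F(x)
    estimates (1/2) * log(p / (1 - p)), then p = 1 / (1 + exp(-2F))."""
    return 1.0 / (1.0 + math.exp(-2.0 * F))

def classify_at_quantile(F, q):
    """Classify at quantile q of P[y=1|x]: predict class 1 when the
    estimated probability exceeds q. With q = 1/2 this reduces to the
    usual sign(F) rule, i.e., equal misclassification costs."""
    return 1 if prob_from_score(F) > q else 0
```

Taking q other than 1/2 encodes unequal misclassification costs; the paper's point is that while sign(F) works well, the intermediate probability estimate itself is often badly overfit.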
Cite
Text
Mease et al. "Boosted Classification Trees and Class Probability/Quantile Estimation." Journal of Machine Learning Research, 2007.
Markdown
[Mease et al. "Boosted Classification Trees and Class Probability/Quantile Estimation." Journal of Machine Learning Research, 2007.](https://mlanthology.org/jmlr/2007/mease2007jmlr-boosted/)
BibTeX
@article{mease2007jmlr-boosted,
title = {{Boosted Classification Trees and Class Probability/Quantile Estimation}},
author = {Mease, David and Wyner, Abraham J. and Buja, Andreas},
journal = {Journal of Machine Learning Research},
year = {2007},
pages = {409--439},
volume = {8},
url = {https://mlanthology.org/jmlr/2007/mease2007jmlr-boosted/}
}