On the Boosting Pruning Problem

Abstract

Boosting is a powerful method for improving the predictive accuracy of classifiers. The AdaBoost algorithm of Freund and Schapire has been successfully applied to many domains [2, 10, 12], and the combination of AdaBoost with the C4.5 decision tree algorithm has been called the best off-the-shelf learning algorithm in practice. Unfortunately, in some applications the number of decision trees required by AdaBoost to achieve reasonable accuracy is enormously large, and the resulting ensemble is therefore very space consuming. This problem was first studied by Margineantu and Dietterich [7], who proposed an empirical method called Kappa pruning that prunes the boosted ensemble of decision trees without sacrificing much accuracy. In this work-in-progress we propose a potential improvement to the Kappa pruning method and also study the boosting pruning problem from a theoretical perspective. We point out that the boosting pruning problem is intractable even to approximate. Finally, we suggest a margin-based theoretical heuristic for this problem.
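To make the pruning idea concrete, the following is a minimal sketch of Kappa-style pruning in the spirit of Margineantu and Dietterich: compute the kappa agreement statistic for every pair of ensemble members on a validation set, then greedily keep classifiers drawn from the lowest-kappa (most diverse) pairs until a size budget is met. This is an illustrative reconstruction, not the authors' code; the function names and the simple greedy selection are assumptions.

```python
import numpy as np
from itertools import combinations

def kappa(pred_a, pred_b, n_classes):
    """Kappa agreement statistic between two classifiers' predictions
    (assumed sketch): (theta1 - theta2) / (1 - theta2), where theta1 is
    observed agreement and theta2 is agreement expected by chance."""
    n = len(pred_a)
    table = np.zeros((n_classes, n_classes))
    for a, b in zip(pred_a, pred_b):
        table[a, b] += 1                 # joint-prediction contingency table
    theta1 = np.trace(table) / n         # fraction of examples both agree on
    theta2 = np.sum(table.sum(0) * table.sum(1)) / n ** 2  # chance agreement
    return 1.0 if theta2 == 1 else (theta1 - theta2) / (1 - theta2)

def kappa_prune(all_preds, budget, n_classes):
    """Keep up to `budget` classifiers, taken from the most-diverse
    (lowest-kappa) pairs first. Returns kept classifier indices."""
    pairs = sorted(
        combinations(range(len(all_preds)), 2),
        key=lambda p: kappa(all_preds[p[0]], all_preds[p[1]], n_classes),
    )
    keep = []
    for i, j in pairs:
        for c in (i, j):
            if c not in keep:
                keep.append(c)
            if len(keep) == budget:
                return sorted(keep)
    return sorted(keep)
```

For example, given three classifiers where two are identical and one always disagrees, pruning to a budget of two keeps one of the duplicates plus the dissenting classifier, since that pair has the lowest kappa.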

Cite

Text

Tamon and Xiang. "On the Boosting Pruning Problem." European Conference on Machine Learning, 2000. doi:10.1007/3-540-45164-1_41

Markdown

[Tamon and Xiang. "On the Boosting Pruning Problem." European Conference on Machine Learning, 2000.](https://mlanthology.org/ecmlpkdd/2000/tamon2000ecml-boosting/) doi:10.1007/3-540-45164-1_41

BibTeX

@inproceedings{tamon2000ecml-boosting,
  title     = {{On the Boosting Pruning Problem}},
  author    = {Tamon, Christino and Xiang, Jie},
  booktitle = {European Conference on Machine Learning},
  year      = {2000},
  pages     = {404--412},
  doi       = {10.1007/3-540-45164-1_41},
  url       = {https://mlanthology.org/ecmlpkdd/2000/tamon2000ecml-boosting/}
}