Boosting Based on Divide and Merge
Abstract
InfoBoost is a boosting algorithm that improves the performance of the master hypothesis whenever each weak hypothesis brings non-zero mutual information about the target. We make the somewhat surprising observation that InfoBoost can be viewed as an algorithm for growing a branching program that repeatedly divides and merges the domain. We generalize the merging process and propose a new class of boosting algorithms, called BP.InfoBoost, with various merging schemes. BP.InfoBoost assigns to each node a weight as well as a weak hypothesis, and the master hypothesis is a threshold function of the sum of the weights over the path induced by a given instance. InfoBoost is a BP.InfoBoost with an extreme scheme that merges all nodes in each round. The other extreme, which merges no nodes, yields an algorithm for growing a decision tree; we call this particular version DT.InfoBoost. We give evidence that DT.InfoBoost improves the master hypothesis very efficiently, but it has a risk of overfitting because the size of the master hypothesis may grow exponentially. We propose a merging scheme between these extremes that improves the master hypothesis nearly as fast as the one without merging while keeping the branching program at a moderate size.
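To make the abstract's description concrete, here is a minimal sketch (not the authors' code) of how a BP.InfoBoost-style master hypothesis could be evaluated: the instance induces a path through the branching program via the weak hypotheses' outputs, the weights along that path are summed, and the sum is thresholded. The `Node` structure, per-branch weights, and all names are assumptions for illustration only; the paper's actual weighting scheme may differ.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Node:
    """One node of the branching program (hypothetical structure)."""
    weak_hyp: Callable[[list], int]          # weak hypothesis: instance -> {-1, +1}
    weight: Dict[int, float]                 # weight for each branch, keyed by output
    child: Dict[int, "Node"] = field(default_factory=dict)  # empty dict at a leaf


def master_hypothesis(root: Node, x, threshold: float = 0.0) -> int:
    """Sum the weights over the path induced by x, then threshold the sum."""
    total, node = 0.0, root
    while node is not None:
        out = node.weak_hyp(x)               # which branch does x take?
        total += node.weight[out]            # accumulate the weight on that branch
        node = node.child.get(out)           # follow the edge; None at a leaf
    return 1 if total >= threshold else -1


# Tiny two-level example: weak hypotheses look at one coordinate each.
h1 = lambda x: 1 if x[0] > 0 else -1
h2 = lambda x: 1 if x[1] > 0 else -1
leaf = Node(weak_hyp=h2, weight={1: 0.3, -1: -0.3})
root = Node(weak_hyp=h1, weight={1: 0.5, -1: -0.5}, child={1: leaf, -1: leaf})
```

Merging schemes then differ in which nodes of the next level share a child: merging all nodes per round recovers InfoBoost's linear structure, while merging none grows a full decision tree (DT.InfoBoost).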
Cite
Text
Takimoto et al. "Boosting Based on Divide and Merge." International Conference on Algorithmic Learning Theory, 2004. doi:10.1007/978-3-540-30215-5_11
Markdown
[Takimoto et al. "Boosting Based on Divide and Merge." International Conference on Algorithmic Learning Theory, 2004.](https://mlanthology.org/alt/2004/takimoto2004alt-boosting/) doi:10.1007/978-3-540-30215-5_11
BibTeX
@inproceedings{takimoto2004alt-boosting,
title = {{Boosting Based on Divide and Merge}},
author = {Takimoto, Eiji and Koya, Syuhei and Maruoka, Akira},
booktitle = {International Conference on Algorithmic Learning Theory},
year = {2004},
pages = {127--141},
doi = {10.1007/978-3-540-30215-5_11},
url = {https://mlanthology.org/alt/2004/takimoto2004alt-boosting/}
}