Maximizing the Margin with Boosting
Abstract
AdaBoost produces a linear combination of weak hypotheses. It has been observed that the generalization error of the algorithm continues to improve even after all examples are classified correctly by the current linear combination, i.e., by a hyperplane in feature space spanned by the weak hypotheses. The improvement is attributed to the experimental observation that the distances (margins) of the examples to the separating hyperplane keep increasing even when the training error is already zero, that is, all examples are on the correct side of the hyperplane. We give an iterative version of AdaBoost that explicitly maximizes the minimum margin of the examples. We bound the number of iterations and the number of hypotheses used in the final linear combination, which approximates the maximum margin hyperplane to a given precision. Our modified algorithm essentially retains the exponential convergence properties of AdaBoost, and our result does not depend on the size of the hypothesis class.
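The following is a minimal sketch of the idea described in the abstract: boosting weak hypotheses while penalizing each hypothesis weight by an adaptive target margin, so the algorithm pushes the minimum margin upward rather than merely driving the training error to zero. It is not the authors' exact algorithm; the decision-stump weak learner, the helper names (stump_predictions, margin_boost), the precision parameter nu, and the toy data are assumptions for illustration.

```python
# Hedged sketch of margin-maximizing boosting with decision stumps.
# NOT the paper's algorithm; an illustration of an adaptive target margin rho.
import numpy as np

def stump_predictions(X):
    """Enumerate all threshold stumps over each feature as +/-1 predictions."""
    preds = []
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            preds.append(np.where(X[:, j] >= thr, 1.0, -1.0))
    return np.array(preds)                     # shape: (n_stumps, n_examples)

def margin_boost(X, y, n_rounds=200, nu=0.05):
    """Boost stumps, penalizing each hypothesis weight by a target margin rho."""
    H = stump_predictions(X)
    n = len(y)
    d = np.full(n, 1.0 / n)                    # distribution over training examples
    gamma_min = 1.0
    alphas, chosen = [], []
    for _ in range(n_rounds):
        edges = H @ (d * y)                    # edge of every stump under d
        t = int(np.argmax(np.abs(edges)))
        sign = 1.0 if edges[t] >= 0 else -1.0
        gamma = min(abs(edges[t]), 1.0 - 1e-9)
        gamma_min = min(gamma_min, gamma)
        # target margin: best guaranteed edge so far minus the precision nu (assumption)
        rho = max(gamma_min - nu, -1.0 + 1e-9)
        # hypothesis weight reduced by the rho term, so only margins above rho shrink the loss
        alpha = 0.5 * (np.log((1 + gamma) / (1 - gamma)) - np.log((1 + rho) / (1 - rho)))
        h = sign * H[t]
        d = d * np.exp(-alpha * y * h)
        d /= d.sum()
        alphas.append(sign * alpha)
        chosen.append(t)
    # normalized combined hypothesis; report the minimum margin it achieves
    f = sum(a * H[t] for a, t in zip(alphas, chosen)) / sum(abs(a) for a in alphas)
    return np.sign(f), float(np.min(y * f))

# Toy usage on two well-separated Gaussian blobs (illustrative data)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 0.5, (50, 2)), rng.normal(1, 0.5, (50, 2))])
y = np.concatenate([-np.ones(50), np.ones(50)])
pred, min_margin = margin_boost(X, y)
print("training error:", np.mean(pred != y), "minimum margin:", min_margin)
```

Under these assumptions, the rho term keeps the update focused on examples whose normalized margin is still below the current target, which is what lets the minimum margin keep growing after the training error reaches zero.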
Cite
Text
Rätsch and Warmuth. "Maximizing the Margin with Boosting." Annual Conference on Computational Learning Theory, 2002. doi:10.1007/3-540-45435-7_23
Markdown
[Rätsch and Warmuth. "Maximizing the Margin with Boosting." Annual Conference on Computational Learning Theory, 2002.](https://mlanthology.org/colt/2002/ratsch2002colt-maximizing/) doi:10.1007/3-540-45435-7_23
BibTeX
@inproceedings{ratsch2002colt-maximizing,
title = {{Maximizing the Margin with Boosting}},
author = {Rätsch, Gunnar and Warmuth, Manfred K.},
booktitle = {Annual Conference on Computational Learning Theory},
year = {2002},
pages = {334-350},
doi = {10.1007/3-540-45435-7_23},
url = {https://mlanthology.org/colt/2002/ratsch2002colt-maximizing/}
}