Concentration Inequalities for the Missing Mass and for Histogram Rule Error
Abstract
This paper gives distribution-free concentration inequalities for the missing mass and the error rate of histogram rules. Negative association methods can be used to reduce these concentration problems to concentration questions about independent sums. Although the sums are independent, they are highly heterogeneous. Such highly heterogeneous independent sums cannot be analyzed using standard concentration inequalities such as Hoeffding's inequality, the Angluin-Valiant bound, Bernstein's inequality, Bennett's inequality, or McDiarmid's theorem. The concentration inequality for histogram rule error is motivated by the desire to construct a new class of bounds on the generalization error of decision trees.
Cite
Text
McAllester and Ortiz. "Concentration Inequalities for the Missing Mass and for Histogram Rule Error." Journal of Machine Learning Research, 2003.Markdown
[McAllester and Ortiz. "Concentration Inequalities for the Missing Mass and for Histogram Rule Error." Journal of Machine Learning Research, 2003.](https://mlanthology.org/jmlr/2003/mcallester2003jmlr-concentration/)BibTeX
@article{mcallester2003jmlr-concentration,
title = {{Concentration Inequalities for the Missing Mass and for Histogram Rule Error}},
author = {McAllester, David and Ortiz, Luis},
journal = {Journal of Machine Learning Research},
year = {2003},
pages = {895-911},
volume = {4},
url = {https://mlanthology.org/jmlr/2003/mcallester2003jmlr-concentration/}
}