Stability Properties of Empirical Risk Minimization over Donsker Classes
Abstract
We study some stability properties of algorithms which minimize (or almost-minimize) empirical error over Donsker classes of functions. We show that, as the number n of samples grows, the L_2-diameter of the set of almost-minimizers of empirical error with tolerance ξ(n) = o(n^{-1/2}) converges to zero in probability. Hence, even in the case of multiple minimizers of expected error, as n increases it becomes less and less likely that adding a sample (or a number of samples) to the training set will result in a large jump to a new hypothesis. Moreover, under some assumptions on the entropy of the class, along with an assumption of Komlos-Major-Tusnady type, we derive a power rate of decay for the diameter of almost-minimizers. This rate, through an application of a uniform ratio limit inequality, is shown to govern the closeness of the expected errors of the almost-minimizers. In fact, under the above assumptions, the expected errors of almost-minimizers become closer with a rate strictly faster than n^{-1/2}.
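The first result stated above can be written out symbolically. The following is only a sketch: the symbols \mathcal{F} (the Donsker class), P (the sampling distribution), P_n (the empirical average over the n samples Z_1, ..., Z_n), and M_n^{\xi} (the set of almost-minimizers) are notation assumed here for illustration and are not taken from the abstract.

```latex
% Assumed notation: P_n f = (1/n) \sum_{i=1}^{n} f(Z_i) is the empirical error of f.
\[
  M_n^{\xi} \;=\; \Bigl\{ f \in \mathcal{F} \;:\;
      P_n f - \inf_{g \in \mathcal{F}} P_n g \,\le\, \xi(n) \Bigr\},
  \qquad \xi(n) = o\bigl(n^{-1/2}\bigr).
\]
% The stability claim: the L_2-diameter of this set vanishes in probability.
\[
  \operatorname{diam} M_n^{\xi}
  \;=\; \sup_{f,\, g \,\in\, M_n^{\xi}} \lVert f - g \rVert_{L_2(P)}
  \;\xrightarrow{\;P\;}\; 0
  \qquad \text{as } n \to \infty .
\]
```

Read this way, the tolerance ξ(n) is required to shrink faster than the n^{-1/2} scale, and any two almost-minimizers computed from the same sample are then close in L_2 with high probability for large n.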
Cite
Text
Caponnetto and Rakhlin. "Stability Properties of Empirical Risk Minimization over Donsker Classes." Journal of Machine Learning Research, 2006.
Markdown
[Caponnetto and Rakhlin. "Stability Properties of Empirical Risk Minimization over Donsker Classes." Journal of Machine Learning Research, 2006.](https://mlanthology.org/jmlr/2006/caponnetto2006jmlr-stability/)
BibTeX
@article{caponnetto2006jmlr-stability,
title = {{Stability Properties of Empirical Risk Minimization over Donsker Classes}},
author = {Caponnetto, Andrea and Rakhlin, Alexander},
journal = {Journal of Machine Learning Research},
year = {2006},
pages = {2565-2583},
volume = {7},
url = {https://mlanthology.org/jmlr/2006/caponnetto2006jmlr-stability/}
}