Asymptotics in Empirical Risk Minimization
Abstract
In this paper, we study a two-category classification problem. We indicate the categories by labels Y = 1 and Y = -1. We observe a covariate, or feature, X ∈ 𝒳 ⊂ ℝ^d. Consider a collection {h_a} of classifiers indexed by a finite-dimensional parameter a, and the classifier h_{a*} that minimizes the prediction error over this class. The parameter a* is estimated by the empirical risk minimizer â_n over the class, where the empirical risk is calculated on a training sample of size n. We apply the Kim-Pollard theorem to show that, under certain differentiability assumptions, â_n converges to a* at rate n^{-1/3}, and we also present the asymptotic distribution of the renormalized estimator. For example, let V_0 denote the set of x on which, given X = x, the label Y = 1 is more likely than the label Y = -1. If X is one-dimensional, the set V_0 is a union of disjoint intervals. The problem is then to estimate the thresholds of the intervals. We obtain the asymptotic distribution of the empirical risk minimizer when the classifiers have K thresholds, where K is fixed. We furthermore consider an extension to higher-dimensional X, assuming basically that V_0 has a smooth boundary in some given parametric class.
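To make the one-dimensional setting concrete, the following is a minimal sketch of empirical risk minimization for the simplest case of a single threshold (K = 1), with classifiers h_a(x) = 1 if x ≥ a and -1 otherwise. It is an illustration only, not the authors' code: the function names `erm_threshold` and `simulate`, and the simulated model P(Y = 1 | X = x) = x on [0, 1] (which gives V_0 = (0.5, 1] and a* = 0.5), are assumptions introduced here.

```python
import numpy as np


def erm_threshold(x, y):
    """Empirical risk minimizer over h_a(x) = 1 if x >= a else -1.

    x: 1-D array of features; y: 1-D array of labels in {-1, +1}.
    Candidate thresholds are the sample points plus +inf (predict -1 everywhere).
    Returns (a_hat, minimal empirical risk).
    """
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    # pos_below[i] = number of +1 labels among the i smallest points
    # (these are misclassified when the threshold is xs[i]).
    pos_below = np.concatenate([[0], np.cumsum(ys == 1)])
    # neg_below[i] = number of -1 labels among the i smallest points.
    neg_below = np.concatenate([[0], np.cumsum(ys == -1)])
    # -1 labels at or above the threshold are misclassified as +1.
    errors = pos_below + (neg_below[-1] - neg_below)
    i = int(np.argmin(errors))
    a_hat = float(xs[i]) if i < len(xs) else float("inf")
    return a_hat, errors[i] / len(xs)


def simulate(n, seed=0):
    """Hypothetical data model: X ~ Uniform(0, 1), P(Y = 1 | X = x) = x."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=n)
    y = np.where(rng.uniform(size=n) < x, 1, -1)
    return x, y
```

Calling `erm_threshold(*simulate(n))` for increasing n and comparing the result with a* = 0.5 gives an informal feel for the slow n^{-1/3} convergence described in the abstract; the sketch does not attempt to reproduce the limiting distribution.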
Cite
Text
Mohammadi and van de Geer. "Asymptotics in Empirical Risk Minimization." Journal of Machine Learning Research, 2005.
Markdown
[Mohammadi and van de Geer. "Asymptotics in Empirical Risk Minimization." Journal of Machine Learning Research, 2005.](https://mlanthology.org/jmlr/2005/mohammadi2005jmlr-asymptotics/)
BibTeX
@article{mohammadi2005jmlr-asymptotics,
title = {{Asymptotics in Empirical Risk Minimization}},
author = {Mohammadi, Leila and van de Geer, Sara},
journal = {Journal of Machine Learning Research},
year = {2005},
pages = {2027-2047},
volume = {6},
url = {https://mlanthology.org/jmlr/2005/mohammadi2005jmlr-asymptotics/}
}