From Stochastic Mixability to Fast Rates

Abstract

Empirical risk minimization (ERM) is a fundamental learning rule for statistical learning problems where the data is generated according to some unknown distribution $\mathsf{P}$ and returns a hypothesis $f$ chosen from a fixed class $\mathcal{F}$ with small loss $\ell$. In the parametric setting, depending upon $(\ell, \mathcal{F},\mathsf{P})$ ERM can have slow $(1/\sqrt{n})$ or fast $(1/n)$ rates of convergence of the excess risk as a function of the sample size $n$. There exist several results that give sufficient conditions for fast rates in terms of joint properties of $\ell$, $\mathcal{F}$, and $\mathsf{P}$, such as the margin condition and the Bernstein condition. In the non-statistical prediction with expert advice setting, there is an analogous slow and fast rate phenomenon, and it is entirely characterized in terms of the mixability of the loss $\ell$ (there being no role there for $\mathcal{F}$ or $\mathsf{P}$). The notion of stochastic mixability builds a bridge between these two models of learning, reducing to classical mixability in a special case. The present paper presents a direct proof of fast rates for ERM in terms of stochastic mixability of $(\ell,\mathcal{F}, \mathsf{P})$, and in so doing provides new insight into the fast-rates phenomenon. The proof exploits an old result of Kemperman on the solution to the general moment problem. We also show a partial converse that suggests a characterization of fast rates for ERM in terms of stochastic mixability is possible.

Cite

Text

Mehta and Williamson. "From Stochastic Mixability to Fast Rates." Neural Information Processing Systems, 2014.

Markdown

[Mehta and Williamson. "From Stochastic Mixability to Fast Rates." Neural Information Processing Systems, 2014.](https://mlanthology.org/neurips/2014/mehta2014neurips-stochastic/)

BibTeX

@inproceedings{mehta2014neurips-stochastic,
  title     = {{From Stochastic Mixability to Fast Rates}},
  author    = {Mehta, Nishant A and Williamson, Robert C.},
  booktitle = {Neural Information Processing Systems},
  year      = {2014},
  pages     = {1197-1205},
  url       = {https://mlanthology.org/neurips/2014/mehta2014neurips-stochastic/}
}