Beyond Least-Squares: Fast Rates for Regularized Empirical Risk Minimization Through Self-Concordance

Abstract

We consider learning methods based on the regularization of a convex empirical risk by a squared Hilbertian norm, a setting that includes linear predictors and non-linear predictors through positive-definite kernels. In order to go beyond the generic analysis leading to convergence rates of the excess risk as $O(1/\sqrt{n})$ from $n$ observations, we assume that the individual losses are self-concordant, that is, their third-order derivatives are bounded by their second-order derivatives. This setting includes least-squares, as well as all generalized linear models such as logistic and softmax regression. For this class of losses, we provide a bias-variance decomposition and show that the assumptions commonly made in least-squares regression, such as the source and capacity conditions, can be adapted to obtain fast non-asymptotic rates of convergence by improving the bias terms, the variance terms, or both.
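As a concrete illustration (a standard computation, not quoted from the paper, with $\varphi$ and $\sigma$ as our own notation), the logistic loss $\varphi(t) = \log(1 + e^{-t})$ satisfies this kind of self-concordance condition. Writing $\sigma(t) = (1 + e^{-t})^{-1}$ for the sigmoid, one has
$$
\varphi'(t) = \sigma(t) - 1, \qquad
\varphi''(t) = \sigma(t)\big(1 - \sigma(t)\big), \qquad
\varphi'''(t) = \sigma(t)\big(1 - \sigma(t)\big)\big(1 - 2\sigma(t)\big),
$$
so that
$$
|\varphi'''(t)| = \varphi''(t)\,\big|1 - 2\sigma(t)\big| \le \varphi''(t) \quad \text{for all } t \in \mathbb{R},
$$
i.e., the third derivative is bounded by the second derivative, as assumed in the abstract for generalized linear models.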

Cite

Text

Marteau-Ferey et al. "Beyond Least-Squares: Fast Rates for Regularized Empirical Risk Minimization Through Self-Concordance." Conference on Learning Theory, 2019.

Markdown

[Marteau-Ferey et al. "Beyond Least-Squares: Fast Rates for Regularized Empirical Risk Minimization Through Self-Concordance." Conference on Learning Theory, 2019.](https://mlanthology.org/colt/2019/marteauferey2019colt-beyond/)

BibTeX

@inproceedings{marteauferey2019colt-beyond,
  title     = {{Beyond Least-Squares: Fast Rates for Regularized Empirical Risk Minimization Through Self-Concordance}},
  author    = {Marteau-Ferey, Ulysse and Ostrovskii, Dmitrii and Bach, Francis and Rudi, Alessandro},
  booktitle = {Conference on Learning Theory},
  year      = {2019},
  pages     = {2294--2340},
  volume    = {99},
  url       = {https://mlanthology.org/colt/2019/marteauferey2019colt-beyond/}
}