Learning with Lq<1 vs L1-Norm Regularisation with Exponentially Many Irrelevant Features

Abstract

We study the use of fractional norms for regularisation in supervised learning from high-dimensional data, in the presence of a large number of irrelevant features, focusing on logistic regression. We develop a variational method for parameter estimation, and show an equivalence between two approximations recently proposed in the statistics literature. Building on previous work by A. Ng, we show that fractional-norm regularised logistic regression enjoys a sample complexity that grows logarithmically with the data dimension and polynomially with the number of relevant dimensions. In addition, extensive empirical testing indicates that fractional-norm regularisation is more suitable than L1 when the number of relevant features is very small, and works very well despite a large number of irrelevant features.
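To make the regularised objective concrete, the following is a minimal sketch (not the paper's variational estimation method) of a logistic regression loss with an Lq penalty for q < 1; the function names and the choice q = 0.5 are illustrative assumptions:

```python
import numpy as np

def lq_penalty(w, q=0.5):
    """Fractional "norm" penalty sum_j |w_j|^q (not a true norm for q < 1)."""
    return np.sum(np.abs(w) ** q)

def regularised_logistic_loss(w, X, y, lam=0.1, q=0.5):
    """Negative log-likelihood of logistic regression plus an Lq penalty.

    X: (n_samples, n_features) design matrix; y: labels in {0, 1};
    lam: regularisation strength; q: fractional exponent (q = 1 recovers L1).
    """
    z = X @ w
    # log(1 + exp(z)) - y*z is the per-sample negative log-likelihood,
    # computed stably via logaddexp.
    nll = np.sum(np.logaddexp(0.0, z) - y * z)
    return nll + lam * lq_penalty(w, q)
```

Note that for q < 1 the penalty is non-convex, so unlike the L1 case any gradient-based fit is only guaranteed a local optimum; the attraction, as the abstract indicates, is the stronger sparsity pressure when very few features are relevant.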

Cite

Text

Kabán and Durrant. "Learning with Lq<1 vs L1-Norm Regularisation with Exponentially Many Irrelevant Features." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2008. doi:10.1007/978-3-540-87479-9_56

Markdown

[Kabán and Durrant. "Learning with Lq<1 vs L1-Norm Regularisation with Exponentially Many Irrelevant Features." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2008.](https://mlanthology.org/ecmlpkdd/2008/kaban2008ecmlpkdd-learning/) doi:10.1007/978-3-540-87479-9_56

BibTeX

@inproceedings{kaban2008ecmlpkdd-learning,
  title     = {{Learning with Lq<1 vs L1-Norm Regularisation with Exponentially Many Irrelevant Features}},
  author    = {Kabán, Ata and Durrant, Robert J.},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2008},
  pages     = {580--596},
  doi       = {10.1007/978-3-540-87479-9_56},
  url       = {https://mlanthology.org/ecmlpkdd/2008/kaban2008ecmlpkdd-learning/}
}