Consistent Feature Selection for Pattern Recognition in Polynomial Time

Abstract

We analyze two different feature selection problems: finding a minimal feature set optimal for classification (MINIMAL-OPTIMAL) vs. finding all features relevant to the target variable (ALL-RELEVANT). The latter problem is motivated by recent applications within bioinformatics, particularly gene expression analysis. For both problems, we identify classes of data distributions for which there exist consistent, polynomial-time algorithms. We also prove that ALL-RELEVANT is much harder than MINIMAL-OPTIMAL and propose two consistent, polynomial-time algorithms. We argue that the distribution classes considered are reasonable in many practical cases, so that our results simplify feature selection in a wide range of machine learning tasks.

Cite

Text

Nilsson et al. "Consistent Feature Selection for Pattern Recognition in Polynomial Time." Journal of Machine Learning Research, 2007.

Markdown

[Nilsson et al. "Consistent Feature Selection for Pattern Recognition in Polynomial Time." Journal of Machine Learning Research, 2007.](https://mlanthology.org/jmlr/2007/nilsson2007jmlr-consistent/)

BibTeX

@article{nilsson2007jmlr-consistent,
  title     = {{Consistent Feature Selection for Pattern Recognition in Polynomial Time}},
  author    = {Nilsson, Roland and Peña, José M. and Björkegren, Johan and Tegnér, Jesper},
  journal   = {Journal of Machine Learning Research},
  year      = {2007},
  pages     = {589-612},
  volume    = {8},
  url       = {https://mlanthology.org/jmlr/2007/nilsson2007jmlr-consistent/}
}