Feature Selection for Unsupervised and Supervised Inference: The Emergence of Sparsity in a Weight-Based Approach

Abstract

The problem of selecting a subset of relevant features in a potentially overwhelming quantity of data is classic and found in many branches of science. Examples in computer vision, text processing and more recently bio-informatics are abundant. In text classification tasks, for example, it is not uncommon to have 10^4 to 10^7 features of the size of the vocabulary containing word frequency counts, with the expectation that only a small fraction of them are relevant. Typical examples include the automatic sorting of URLs into a web directory and the detection of spam email.
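The paper's central idea, reflected in the title, is to attach a continuous weight to each feature and let sparsity emerge from the optimization itself rather than from an explicit combinatorial search. The sketch below illustrates that idea in the spirit of the authors' Q-α iteration; the exact update rules, the name `q_alpha_sketch`, and the toy data are assumptions for illustration, not a verbatim reproduction of the paper's algorithm.

```python
import numpy as np

def q_alpha_sketch(M, k=2, n_iter=50):
    """Toy weight-based feature selection in the spirit of Q-alpha (illustrative).

    M : (n_features, n_samples) data matrix; row m_i holds feature i across samples.
    k : number of principal directions to preserve.
    Returns (alpha, Q): one weight per feature and the k retained directions.
    The fixed point of the iteration tends to be sparse, so features with
    large |alpha| are the selected ones.
    """
    n_features = M.shape[0]
    alpha = np.ones(n_features) / np.sqrt(n_features)  # unit-norm start
    Q = None
    for _ in range(n_iter):
        # Weighted correlation matrix A = sum_i alpha_i * m_i m_i^T
        A = M.T @ (alpha[:, None] * M)                 # (n_samples, n_samples)
        # Q: top-k eigenvectors of the symmetric matrix A
        _, vecs = np.linalg.eigh(A)
        Q = vecs[:, -k:]
        # alpha update: leading eigenvector of G_ij = (m_i^T m_j)(m_i^T Q Q^T m_j)
        P = M @ Q                                      # (n_features, k)
        G = (M @ M.T) * (P @ P.T)
        _, V = np.linalg.eigh(G)
        alpha = V[:, -1]
        if alpha.sum() < 0:                            # fix eigenvector sign
            alpha = -alpha
    return alpha, Q

# Toy usage: 3 correlated informative features plus 47 noise features.
rng = np.random.default_rng(0)
signal = rng.standard_normal((1, 100))
M = np.vstack([signal * s for s in (1.0, 0.9, 0.8)] +
              [0.1 * rng.standard_normal((47, 100))])
alpha, _ = q_alpha_sketch(M, k=1)
print(np.argsort(-np.abs(alpha))[:5])                  # rows 0-2 should dominate
```

The point of the sketch is the qualitative behavior: each weight update is a plain eigenvector computation, yet the resulting weight vector concentrates its mass on the informative rows, which is the sparsity phenomenon the title refers to.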

Cite

Text

Wolf and Shashua. "Feature Selection for Unsupervised and Supervised Inference: The Emergence of Sparsity in a Weight-Based Approach." Journal of Machine Learning Research, 6:1855–1887, 2005.

Markdown

[Wolf and Shashua. "Feature Selection for Unsupervised and Supervised Inference: The Emergence of Sparsity in a Weight-Based Approach." Journal of Machine Learning Research, 6:1855–1887, 2005.](https://mlanthology.org/jmlr/2005/wolf2005jmlr-feature/)

BibTeX

@article{wolf2005jmlr-feature,
  title     = {{Feature Selection for Unsupervised and Supervised Inference: The Emergence of Sparsity in a Weight-Based Approach}},
  author    = {Wolf, Lior and Shashua, Amnon},
  journal   = {Journal of Machine Learning Research},
  year      = {2005},
  pages     = {1855--1887},
  volume    = {6},
  url       = {https://mlanthology.org/jmlr/2005/wolf2005jmlr-feature/}
}