Streaming Feature Selection Using IIC

Abstract

In Streaming Feature Selection (SFS), new features are sequentially considered for addition to a predictive model. When the space of potential features is large, SFS offers many advantages over methods in which all features are assumed to be known in advance. Features can be generated dynamically, focusing the search for new features on promising subspaces, and overfitting can be controlled by dynamically adjusting the threshold for adding features to the model. We present a new, adaptive complexity penalty, the Information Investing Criterion (IIC), which uses an efficient coding of features added, and not added, to the model to dynamically adjust the threshold on the entropy reduction required for adding a new feature. Streaming Feature Selection with IIC gives strong guarantees against overfitting. In contrast, standard penalty methods such as BIC or RIC always drastically over- or under-fit in the limit of infinite numbers of non-predictive features. Empirical
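
A minimal sketch of the kind of loop the abstract describes, under stated assumptions. It is not the paper's IIC rule, whose exact coding scheme is not given here. The bid schedule `wealth / (2 * (i + 1))`, the `payout` on acceptance, the flat `coef_bits` cost per coefficient, and the Gaussian-residual estimate of bits saved are all illustrative choices in the style of wealth-based (alpha-investing-like) rules. The function name `streaming_select` and its parameters are hypothetical.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def streaming_select(feature_stream, y, wealth=0.5, payout=0.5, coef_bits=2.0):
    """Consider features one at a time; return 0-based indices of those kept.

    Assumption: a feature is kept when the bits saved in coding the residuals
    exceed an adaptive threshold that depends on the current wealth.
    """
    n = len(y)
    basis = [[1.0 / math.sqrt(n)] * n]  # orthonormal basis; starts with intercept
    mean_y = sum(y) / n
    resid = [v - mean_y for v in y]
    rss = max(dot(resid, resid), 1e-12)
    selected = []
    for i, x in enumerate(feature_stream):
        bid = wealth / (2 * (i + 1))  # never more than half the wealth
        threshold_bits = -math.log2(bid) + coef_bits
        # Remove the part of x already explained by the current model.
        x_perp = list(x)
        for b in basis:
            c = dot(x_perp, b)
            x_perp = [xv - c * bv for xv, bv in zip(x_perp, b)]
        norm2 = dot(x_perp, x_perp)
        gain = dot(x_perp, resid) ** 2 / norm2 if norm2 > 1e-12 else 0.0
        new_rss = max(rss - gain, 1e-12)
        # Bits saved when coding Gaussian residuals with the smaller variance.
        saved_bits = 0.5 * n * math.log2(rss / new_rss)
        if saved_bits > threshold_bits:
            unit = [v / math.sqrt(norm2) for v in x_perp]
            c = dot(resid, unit)
            resid = [r - c * u for r, u in zip(resid, unit)]
            basis.append(unit)
            rss = new_rss
            selected.append(i)
            wealth += payout  # acceptance earns wealth, lowering the threshold
        else:
            wealth -= bid  # rejection spends the bid, raising the threshold
    return selected


# Synthetic usage: 20 candidate features, only columns 2 and 7 are predictive.
random.seed(0)
n_rows = 200
cols = [[random.gauss(0, 1) for _ in range(n_rows)] for _ in range(20)]
y = [3 * cols[2][k] - 2 * cols[7][k] + random.gauss(0, 1) for k in range(n_rows)]
print(streaming_select(cols, y))
```

Because each bid is at most half the current wealth, the wealth stays positive however many features are rejected. Each rejection shrinks the next bid and so raises the bar in bits, while each acceptance replenishes the wealth. Since `feature_stream` is any iterable, candidate features can be generated lazily rather than known in advance.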

Cite

Text

Ungar et al. "Streaming Feature Selection Using IIC." Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, 2005.

Markdown

[Ungar et al. "Streaming Feature Selection Using IIC." Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, 2005.](https://mlanthology.org/aistats/2005/ungar2005aistats-streaming/)

BibTeX

@inproceedings{ungar2005aistats-streaming,
  title     = {{Streaming Feature Selection Using IIC}},
  author    = {Ungar, Lyle H. and Zhou, Jing and Foster, Dean P. and Stine, Bob A.},
  booktitle = {Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics},
  year      = {2005},
  pages     = {357--364},
  volume    = {R5},
  url       = {https://mlanthology.org/aistats/2005/ungar2005aistats-streaming/}
}