Active Sampling for Detecting Irrelevant Features
Abstract
The general approach for automatically driving data collection using information from previously acquired data is called active learning. Traditional active learning addresses the problem of choosing the unlabeled examples for which the class labels are queried, with the goal of learning a classifier. In contrast, we address the problem of active feature sampling for detecting useless features. We propose a strategy to actively sample the values of new features on class-labeled examples, with the objective of feature relevance assessment. We derive an active feature sampling algorithm from an information-theoretic and statistical formulation of the problem. We present experimental results on synthetic, UCI, and real-world datasets to demonstrate that our active sampling algorithm can provide accurate estimates of feature relevance with lower data acquisition costs than random sampling and other previously proposed sampling algorithms.
Cite
Text
Veeramachaneni et al. "Active Sampling for Detecting Irrelevant Features." International Conference on Machine Learning, 2006. doi:10.1145/1143844.1143965
Markdown
[Veeramachaneni et al. "Active Sampling for Detecting Irrelevant Features." International Conference on Machine Learning, 2006.](https://mlanthology.org/icml/2006/veeramachaneni2006icml-active/) doi:10.1145/1143844.1143965
BibTeX
@inproceedings{veeramachaneni2006icml-active,
title = {{Active Sampling for Detecting Irrelevant Features}},
author = {Veeramachaneni, Sriharsha and Olivetti, Emanuele and Avesani, Paolo},
booktitle = {International Conference on Machine Learning},
year = {2006},
pages = {961--968},
doi = {10.1145/1143844.1143965},
url = {https://mlanthology.org/icml/2006/veeramachaneni2006icml-active/}
}