A Novel Scalable and Data Efficient Feature Subset Selection Algorithm
Abstract
In this paper, we aim to identify the minimal subset of discrete random variables that is relevant for probabilistic classification in data sets with many variables but few instances. A principled solution to this problem is to determine the Markov boundary of the class variable. We then present a novel scalable, data-efficient and correct Markov boundary learning algorithm under the so-called faithfulness condition. We report extensive empirical experiments on synthetic and real data sets scaling up to 139,351 variables.
Cite
Text
de Morais and Aussem. "A Novel Scalable and Data Efficient Feature Subset Selection Algorithm." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2008. doi:10.1007/978-3-540-87481-2_20
Markdown
[de Morais and Aussem. "A Novel Scalable and Data Efficient Feature Subset Selection Algorithm." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2008.](https://mlanthology.org/ecmlpkdd/2008/demorais2008ecmlpkdd-novel/) doi:10.1007/978-3-540-87481-2_20
BibTeX
@inproceedings{demorais2008ecmlpkdd-novel,
title = {{A Novel Scalable and Data Efficient Feature Subset Selection Algorithm}},
author = {de Morais, Sergio Rodrigues and Aussem, Alex},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2008},
pages = {298--312},
doi = {10.1007/978-3-540-87481-2_20},
url = {https://mlanthology.org/ecmlpkdd/2008/demorais2008ecmlpkdd-novel/}
}