A Novel Scalable and Data Efficient Feature Subset Selection Algorithm
Abstract
In this paper, we aim to identify the minimal subset of discrete random variables that is relevant for probabilistic classification in data sets with many variables but few instances. A principled solution to this problem is to determine the Markov boundary of the class variable. We then present a novel scalable, data-efficient and correct Markov boundary learning algorithm under the so-called faithfulness condition. We report extensive empirical experiments on synthetic and real data sets scaling up to 139,351 variables.
Cite
Text
de Morais and Aussem. "A Novel Scalable and Data Efficient Feature Subset Selection Algorithm." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2008. doi:10.1007/978-3-540-87481-2_20
Markdown
[de Morais and Aussem. "A Novel Scalable and Data Efficient Feature Subset Selection Algorithm." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2008.](https://mlanthology.org/ecmlpkdd/2008/demorais2008ecmlpkdd-novel/) doi:10.1007/978-3-540-87481-2_20
BibTeX
@inproceedings{demorais2008ecmlpkdd-novel,
title = {{A Novel Scalable and Data Efficient Feature Subset Selection Algorithm}},
author = {de Morais, Sergio Rodrigues and Aussem, Alex},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2008},
pages = {298--312},
doi = {10.1007/978-3-540-87481-2_20},
url = {https://mlanthology.org/ecmlpkdd/2008/demorais2008ecmlpkdd-novel/}
}