On the Stability of Feature Selection in the Presence of Feature Correlations

Abstract

Feature selection is central to modern data science. The ‘stability’ of a feature selection algorithm refers to the sensitivity of its choices to small changes in training data. This is, in effect, the robustness of the chosen features. This paper considers the estimation of stability when we expect strong pairwise correlations, otherwise known as feature redundancy. We demonstrate that existing measures are inappropriate here, as they systematically underestimate the true stability, giving an overly pessimistic view of a feature set. We propose a new statistical measure which overcomes this issue, and generalises previous work.
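
The underestimation effect described above can be illustrated with a small sketch. This is not the measure proposed in the paper, which the abstract does not specify. It uses mean pairwise Jaccard similarity as a simple stand-in for an "existing measure", and the feature indices, the `group_of` mapping of redundant features, and the four selection runs are all hypothetical values chosen for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two feature subsets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_stability(subsets, group_of=None):
    """Average Jaccard similarity over all pairs of selected subsets.

    If group_of is given (hypothetical mapping: feature -> redundancy group),
    each feature is replaced by its group label before comparison, so that
    swapping one redundant feature for another is not counted as instability.
    """
    if group_of is not None:
        subsets = [{group_of.get(f, f) for f in s} for s in subsets]
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Assumed toy setting: features 0 and 1 are strongly correlated, as are 2 and 3.
# Each run (e.g. on a resampled training set) picks one feature from each pair.
runs = [{0, 2}, {1, 2}, {0, 3}, {1, 3}]
group_of = {0: "A", 1: "A", 2: "B", 3: "B"}

naive = mean_pairwise_stability(runs)            # ignores redundancy
aware = mean_pairwise_stability(runs, group_of)  # treats redundant features as equal

print(f"naive stability: {naive:.3f}")
print(f"redundancy-aware stability: {aware:.3f}")
```

In this toy setting the naive score is low even though every run selects the same underlying information, which is the kind of pessimistic estimate the abstract refers to. Collapsing features into groups is only one crude way to account for redundancy and is used here purely to make the contrast visible.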

Cite

Text

Sechidis et al. "On the Stability of Feature Selection in the Presence of Feature Correlations." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2019. doi:10.1007/978-3-030-46150-8_20

Markdown

[Sechidis et al. "On the Stability of Feature Selection in the Presence of Feature Correlations." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2019.](https://mlanthology.org/ecmlpkdd/2019/sechidis2019ecmlpkdd-stability/) doi:10.1007/978-3-030-46150-8_20

BibTeX

@inproceedings{sechidis2019ecmlpkdd-stability,
  title     = {{On the Stability of Feature Selection in the Presence of Feature Correlations}},
  author    = {Sechidis, Konstantinos and Papangelou, Konstantinos and Nogueira, Sarah and Weatherall, James and Brown, Gavin},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2019},
  pages     = {327--342},
  doi       = {10.1007/978-3-030-46150-8_20},
  url       = {https://mlanthology.org/ecmlpkdd/2019/sechidis2019ecmlpkdd-stability/}
}