Measuring the Stability of Feature Selection
Abstract
In feature selection algorithms, “stability” is the sensitivity of the chosen feature set to variations in the supplied training data. As such, it can be seen as analogous to the statistical variance of a predictor. However, unlike variance, there is no unique definition of stability, with numerous measures proposed over 15 years of literature. In this paper, instead of defining a new measure, we start from an axiomatic point of view and identify what properties would be desirable. Somewhat surprisingly, we find that the simple Pearson’s correlation coefficient has all necessary properties, yet has somehow been overlooked in favour of more complex alternatives. Finally, we illustrate how the use of this measure in practice can provide better interpretability and more confidence in the model selection process. The data and software related to this paper are available at https://github.com/nogueirs/ECML2016 .
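As a minimal sketch of the idea in the abstract: if each feature selection run is encoded as a binary indicator vector over the d features, stability can be scored as the average pairwise Pearson correlation between those vectors. The function name and the simple pairwise averaging here are illustrative assumptions, not the paper's exact formulation; see the linked repository for the authors' own code.

```python
import numpy as np

def pearson_stability(Z):
    """Average pairwise Pearson correlation between selection vectors.

    Z is an (M, d) 0/1 matrix: row i marks which of the d features
    were chosen on selection run i. (Illustrative sketch of using
    Pearson's correlation as a stability measure; the averaging
    scheme here is an assumption, not the paper's exact definition.)
    """
    Z = np.asarray(Z, dtype=float)
    M = Z.shape[0]
    # Correlation matrix between runs (rows); off-diagonal entries
    # are the pairwise Pearson correlations.
    C = np.corrcoef(Z)
    # Average over the M*(M-1)/2 distinct pairs of runs.
    iu = np.triu_indices(M, k=1)
    return C[iu].mean()

# Identical selections across runs score 1 (perfectly stable);
# disjoint selections of complementary features score -1.
Z_stable = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0]]
```

A perfectly stable selector (same features every run) scores 1; as selections diverge across runs, the score decreases.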
Cite
Text
Nogueira and Brown. "Measuring the Stability of Feature Selection." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016. doi:10.1007/978-3-319-46227-1_28

Markdown
[Nogueira and Brown. "Measuring the Stability of Feature Selection." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016.](https://mlanthology.org/ecmlpkdd/2016/nogueira2016ecmlpkdd-measuring/) doi:10.1007/978-3-319-46227-1_28

BibTeX
@inproceedings{nogueira2016ecmlpkdd-measuring,
title = {{Measuring the Stability of Feature Selection}},
author = {Nogueira, Sarah and Brown, Gavin},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2016},
pages = {442-457},
doi = {10.1007/978-3-319-46227-1_28},
url = {https://mlanthology.org/ecmlpkdd/2016/nogueira2016ecmlpkdd-measuring/}
}