Unsupervised Feature Selection with Ensemble Learning
Abstract
In this paper, we show that the internal estimates used to measure variable importance in Random Forests are also applicable to feature selection in unsupervised learning. We propose a new method called Random Cluster Ensemble (RCE for short), which estimates the out-of-bag feature importance from an ensemble of partitions. Each partition is constructed using a different bootstrap sample and a random subset of the features. We provide empirical results on nineteen benchmark data sets indicating that RCE, boosted with a recursive feature elimination (RFE) scheme (Guyon and Elisseeff, Journal of Machine Learning Research, 3:1157–1182, 2003), can lead to significant improvements in clustering accuracy over several state-of-the-art supervised and unsupervised algorithms, using a very limited subset of features. The method shows promise for dealing with very large domains. All results, datasets and algorithms are available online ( http://perso.univ-lyon1.fr/haytham.elghazel/RCE.zip ).
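The abstract's recipe (an ensemble of clusterings, each built on a bootstrap sample over a random feature subset, with feature importance estimated by permutation on the out-of-bag points) can be illustrated with a minimal sketch. This is not the paper's implementation: the clusterer (a small k-means), the parameters, and the importance measure (fraction of out-of-bag assignments that change when a feature is permuted) are all simplified assumptions chosen for brevity.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(p, centers):
    """Index of the nearest center to point p."""
    return min(range(len(centers)), key=lambda c: dist2(p, centers[c]))


def kmeans(points, k, iters=20, n_init=5, rng=None):
    """Plain k-means; keeps the best of n_init random restarts (lowest SSE)."""
    rng = rng or random.Random(0)
    best = None
    for _ in range(n_init):
        centers = [list(p) for p in rng.sample(points, k)]
        for _ in range(iters):
            labels = [assign(p, centers) for p in points]
            for c in range(k):
                members = [p for p, l in zip(points, labels) if l == c]
                if members:
                    centers[c] = [sum(v) / len(members) for v in zip(*members)]
        inertia = sum(dist2(p, centers[assign(p, centers)]) for p in points)
        if best is None or inertia < best[0]:
            best = (inertia, centers)
    return best[1]


def rce_importance(X, k=2, n_estimators=30, subset_size=2, seed=0):
    """RCE-style sketch: out-of-bag permutation importance over an ensemble
    of partitions, each fit on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    score, count = [0.0] * d, [0] * d
    for _ in range(n_estimators):
        feats = rng.sample(range(d), subset_size)       # random feature subset
        boot = [rng.randrange(n) for _ in range(n)]     # bootstrap sample
        oob = [i for i in range(n) if i not in set(boot)]
        if len(oob) < 2:
            continue
        proj = lambda i: [X[i][f] for f in feats]
        centers = kmeans([proj(i) for i in boot], k, rng=rng)
        base = [assign(proj(i), centers) for i in oob]  # baseline OOB labels
        for j, f in enumerate(feats):
            # Permute feature f among OOB points; count changed assignments.
            perm = [X[i][f] for i in oob]
            rng.shuffle(perm)
            changed = 0
            for idx, i in enumerate(oob):
                p = proj(i)
                p[j] = perm[idx]
                if assign(p, centers) != base[idx]:
                    changed += 1
            score[f] += changed / len(oob)
            count[f] += 1
    return [s / c if c else 0.0 for s, c in zip(score, count)]
```

On toy data with two well-separated clusters carried by the first two features and a third pure-noise feature, the permutation score of the informative features should dominate that of the noise feature, mimicking how RCE ranks features before RFE discards the weakest ones.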
Cite
Text
Elghazel and Aussem. "Unsupervised Feature Selection with Ensemble Learning." Machine Learning, 2015. doi:10.1007/S10994-013-5337-8
Markdown
[Elghazel and Aussem. "Unsupervised Feature Selection with Ensemble Learning." Machine Learning, 2015.](https://mlanthology.org/mlj/2015/elghazel2015mlj-unsupervised/) doi:10.1007/S10994-013-5337-8
BibTeX
@article{elghazel2015mlj-unsupervised,
title = {{Unsupervised Feature Selection with Ensemble Learning}},
author = {Elghazel, Haytham and Aussem, Alex},
journal = {Machine Learning},
year = {2015},
pages = {157--180},
doi = {10.1007/S10994-013-5337-8},
volume = {98},
url = {https://mlanthology.org/mlj/2015/elghazel2015mlj-unsupervised/}
}