A User-Guided Bayesian Framework for Ensemble Feature Selection in Life Science Applications (UBayFS)
Abstract
Feature selection reduces the complexity of high-dimensional datasets and helps to gain insights into systematic variation in the data. These aspects are essential in domains that rely on model interpretability, such as life sciences. We propose a (U)ser-Guided (Bay)esian Framework for (F)eature (S)election, UBayFS, an ensemble feature selection technique embedded in a Bayesian statistical framework. Our generic approach considers two sources of information: data and domain knowledge. From data, we build an ensemble of feature selectors, described by a multinomial likelihood model. Using domain knowledge, the user guides UBayFS by weighting features and penalizing feature blocks or combinations, implemented via a Dirichlet-type prior distribution. Hence, the framework combines three main aspects: ensemble feature selection, expert knowledge, and side constraints. Our experiments demonstrate that UBayFS (a) allows for a balanced trade-off between user knowledge and data observations and (b) achieves accurate and robust results.
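As a rough illustration of how a multinomial likelihood over ensemble votes can combine with a Dirichlet prior over feature weights, the following minimal sketch uses the standard conjugate update (posterior parameters = prior parameters + selection counts). It is not the authors' implementation: the function names, the toy data, and the plain top-k selection rule are assumptions made for illustration, and the side constraints on feature blocks or combinations described in the abstract are omitted entirely.

```python
from collections import Counter


def posterior_feature_weights(selections, prior_weights):
    """Conjugate Dirichlet-multinomial update (illustrative, hypothetical helper).

    selections    -- list of feature-index lists, one per elementary selector
    prior_weights -- Dirichlet parameters encoding user knowledge per feature
    Returns the posterior mean importance of each feature.
    """
    # Count how often each feature was chosen across the ensemble.
    counts = Counter(f for selected in selections for f in selected)
    # Posterior Dirichlet parameters: prior weight plus observed count.
    posterior = [w + counts.get(i, 0) for i, w in enumerate(prior_weights)]
    total = sum(posterior)
    return [p / total for p in posterior]


def top_k(weights, k):
    """Indices of the k features with the largest posterior mean weight."""
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return sorted(ranked[:k])


# Toy example (assumed data): 4 features, 3 elementary selectors picking 2 each.
selections = [[0, 1], [0, 2], [0, 1]]
# The user up-weights feature 3, which the data-driven selectors never chose.
prior_weights = [1.0, 1.0, 1.0, 4.0]

weights = posterior_feature_weights(selections, prior_weights)
selected = top_k(weights, 2)
print(selected)
```

In this toy setting the strong prior on feature 3 lets it enter the selected set alongside the most frequently chosen feature, which illustrates the trade-off between user knowledge and data observations that the abstract refers to.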
Cite
Text
Jenul et al. "A User-Guided Bayesian Framework for Ensemble Feature Selection in Life Science Applications (UBayFS)." Machine Learning, 2022. doi:10.1007/S10994-022-06221-9
Markdown
[Jenul et al. "A User-Guided Bayesian Framework for Ensemble Feature Selection in Life Science Applications (UBayFS)." Machine Learning, 2022.](https://mlanthology.org/mlj/2022/jenul2022mlj-userguided/) doi:10.1007/S10994-022-06221-9
BibTeX
@article{jenul2022mlj-userguided,
title = {{A User-Guided Bayesian Framework for Ensemble Feature Selection in Life Science Applications (UBayFS)}},
author = {Jenul, Anna and Schrunner, Stefan and Pilz, Jürgen and Tomic, Oliver},
journal = {Machine Learning},
year = {2022},
pages = {3897-3923},
doi = {10.1007/S10994-022-06221-9},
volume = {111},
url = {https://mlanthology.org/mlj/2022/jenul2022mlj-userguided/}
}