Probabilistic Multi-Task Feature Selection
Abstract
Recently, some variants of the $l_1$ norm, particularly matrix norms such as the $l_{1,2}$ and $l_{1,\infty}$ norms, have been widely used in multi-task learning, compressed sensing and other related areas to enforce sparsity via joint regularization. In this paper, we unify the $l_{1,2}$ and $l_{1,\infty}$ norms by considering a family of $l_{1,q}$ norms for $1 < q\le\infty$ and study the problem of determining the most appropriate sparsity-enforcing norm to use in the context of multi-task feature selection. Using the generalized normal distribution, we provide a probabilistic interpretation of the general multi-task feature selection problem regularized by the $l_{1,q}$ norm. Based on this probabilistic interpretation, we develop a probabilistic model using the noninformative Jeffreys prior. We also extend the model to learn and exploit more general types of pairwise relationships between tasks. For both versions of the model, we devise expectation-maximization~(EM) algorithms to learn all model parameters, including $q$, automatically. Experiments have been conducted on two cancer classification applications using microarray gene expression data.
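To make the joint regularizer concrete, the sketch below computes the $l_{1,q}$ norm of a $d \times m$ weight matrix $W$ (one row per feature, one column per task) and plugs it into a regularized multi-task objective. This is a minimal illustration under stated assumptions, not the paper's model: the squared loss, the names l1q_norm, objective, lam, Xs, and ys, and the fixed value of q are all assumptions introduced for the example, whereas the paper addresses classification and learns q automatically via EM.

import numpy as np

def l1q_norm(W, q):
    # l_{1,q} norm of a d x m weight matrix W: the sum over the d
    # features (rows) of the l_q norm of that feature's weights
    # across the m tasks; q = 2 and q = inf recover the l_{1,2}
    # and l_{1,inf} norms discussed above.
    if np.isinf(q):
        return np.abs(W).max(axis=1).sum()
    return ((np.abs(W) ** q).sum(axis=1) ** (1.0 / q)).sum()

def objective(W, Xs, ys, lam, q):
    # Illustrative regularized objective: per-task squared loss
    # (a stand-in loss for this sketch) plus the joint l_{1,q}
    # penalty that couples the tasks feature-wise.
    loss = sum(((X @ w - y) ** 2).sum()
               for X, w, y in zip(Xs, W.T, ys))
    return 0.5 * loss + lam * l1q_norm(W, q)

# Example: the all-zero second row (feature) contributes nothing
# to the penalty, which is what makes minimizing it drive entire
# rows of W to zero, i.e., joint feature selection across tasks.
W = np.array([[0.5, 0.4], [0.0, 0.0], [1.2, -0.9]])
print(l1q_norm(W, 2.0), l1q_norm(W, np.inf))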
Cite
Text
Zhang et al. "Probabilistic Multi-Task Feature Selection." Neural Information Processing Systems, 2010.
Markdown
[Zhang et al. "Probabilistic Multi-Task Feature Selection." Neural Information Processing Systems, 2010.](https://mlanthology.org/neurips/2010/zhang2010neurips-probabilistic/)
BibTeX
@inproceedings{zhang2010neurips-probabilistic,
title = {{Probabilistic Multi-Task Feature Selection}},
author = {Zhang, Yu and Yeung, Dit-Yan and Xu, Qian},
booktitle = {Neural Information Processing Systems},
year = {2010},
pages = {2559--2567},
url = {https://mlanthology.org/neurips/2010/zhang2010neurips-probabilistic/}
}