Direct Conditional Probability Density Estimation with Sparse Feature Selection
Abstract
Regression is a fundamental problem in statistical data analysis, which aims at estimating the conditional mean of output given input. However, regression is not informative enough if the conditional probability density is multi-modal, asymmetric, and heteroscedastic. To overcome this limitation, various estimators of conditional densities themselves have been developed, and a kernel-based approach called least-squares conditional density estimation (LS-CDE) was demonstrated to be promising. However, LS-CDE still suffers from large estimation error if input contains many irrelevant features. In this paper, we therefore propose an extension of LS-CDE called sparse additive CDE (SA-CDE), which allows automatic feature selection in CDE. SA-CDE applies kernel LS-CDE to each input feature in an additive manner and penalizes the whole solution by a group-sparse regularizer. We also give a subgradient-based optimization method for SA-CDE training that scales well to high-dimensional large data sets. Through experiments with benchmark and humanoid robot transition datasets, we demonstrate the usefulness of SA-CDE in noisy CDE problems.
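The group-sparse penalty described in the abstract can be illustrated with a toy sketch: an additive model with one Gaussian-basis coefficient group per input feature, trained by proximal gradient steps under a group-lasso penalty so that whole feature groups are driven to zero. This is a deliberate simplification using a squared regression loss rather than the paper's least-squares density-fitting objective; the data, basis choice, and all parameter values below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: the target depends only on feature 0; features 1-2 are irrelevant.
n, d, b = 500, 3, 10                     # samples, input features, bases per feature
X = rng.uniform(-1.0, 1.0, (n, d))
y = np.sin(np.pi * X[:, 0])

centers = np.linspace(-1.0, 1.0, b)

def phi(x_col):
    # Gaussian basis expansion of a single input feature (hypothetical choice).
    return np.exp(-(x_col[:, None] - centers[None, :]) ** 2 / 0.1)

Phi = [phi(X[:, j]) for j in range(d)]   # one design block per feature
theta = [np.zeros(b) for _ in range(d)]  # one coefficient group per feature
lam, lr = 0.1, 0.1                       # group-lasso weight, step size

for _ in range(2000):
    resid = sum(P @ t for P, t in zip(Phi, theta)) - y
    for j in range(d):
        theta[j] -= lr * (Phi[j].T @ resid) / n   # gradient step, squared loss
        # Proximal step for lam * ||theta_j||_2: shrink the group, or zero it.
        norm = np.linalg.norm(theta[j])
        theta[j] *= max(0.0, 1.0 - lr * lam / norm) if norm > 0 else 0.0

norms = [np.linalg.norm(t) for t in theta]
print(norms)  # coefficient groups for the irrelevant features shrink toward zero
```

The key point this sketch shares with SA-CDE is that the penalty acts on each feature's coefficient group as a whole, so irrelevant input features are removed entirely rather than coefficient by coefficient; the paper applies the same idea to the LS-CDE density-fitting criterion instead of squared-error regression.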
Cite
Text
Shiga et al. "Direct Conditional Probability Density Estimation with Sparse Feature Selection." Machine Learning, 2015. doi:10.1007/S10994-014-5472-X
Markdown
[Shiga et al. "Direct Conditional Probability Density Estimation with Sparse Feature Selection." Machine Learning, 2015.](https://mlanthology.org/mlj/2015/shiga2015mlj-direct/) doi:10.1007/S10994-014-5472-X
BibTeX
@article{shiga2015mlj-direct,
title = {{Direct Conditional Probability Density Estimation with Sparse Feature Selection}},
author = {Shiga, Motoki and Tangkaratt, Voot and Sugiyama, Masashi},
journal = {Machine Learning},
year = {2015},
pages = {161--182},
doi = {10.1007/S10994-014-5472-X},
volume = {100},
url = {https://mlanthology.org/mlj/2015/shiga2015mlj-direct/}
}