Applying a Gaussian-Bernoulli Mixture Model Network to Binary and Continuous Missing Data in Medicine
Abstract
We wish to train a feedforward projective-sigmoidal neural network (MLP) on breast cancer outcomes data missing both binary and continuous input variable values. A Gaussian-Bernoulli mixture model is trained on the data (using EM). It then performs stochastic imputation (filling in) of the missing values, as a preprocessor to the MLP. In order to compare predictive accuracy when the training data are complete vs. incomplete/imputed, we use only complete cases from a natural data set, but artificially remove 80% of their input data values. Very little difference is observed in the comparison, suggesting that the mixture model is quite effective here, despite the fact that more than 99% of the cases/instances had some missing value(s). The mixture model can be used both for output/outcome prediction by a trained MLP and for the training process itself.
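The stochastic imputation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the mixture parameters have already been fitted (the EM step is omitted), that features are conditionally independent within each component (diagonal Gaussians for continuous inputs, independent Bernoullis for binary inputs), and that missing entries are marked with `None`. All function names, parameter values, and the example record are hypothetical.

```python
import math
import random

def log_likelihood_observed(x, is_binary, comp):
    """Log-likelihood of the observed entries of x under one mixture component."""
    total = 0.0
    for j, value in enumerate(x):
        if value is None:
            continue  # missing entries are marginalized out and contribute nothing
        if is_binary[j]:
            p = comp["prob"][j]
            total += math.log(p if value == 1 else 1.0 - p)
        else:
            var = comp["var"][j]
            diff = value - comp["mean"][j]
            total += -0.5 * (math.log(2.0 * math.pi * var) + diff * diff / var)
    return total

def responsibilities(x, is_binary, components):
    """Posterior probability of each component given the observed entries only."""
    logs = [math.log(c["weight"]) + log_likelihood_observed(x, is_binary, c)
            for c in components]
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - top) for v in logs]
    norm = sum(unnorm)
    return [u / norm for u in unnorm]

def impute(x, is_binary, components, rng):
    """Draw one stochastic imputation: sample a component, then the missing entries."""
    post = responsibilities(x, is_binary, components)
    k = rng.choices(range(len(components)), weights=post)[0]
    comp = components[k]
    filled = list(x)
    for j, value in enumerate(x):
        if value is None:
            if is_binary[j]:
                filled[j] = 1 if rng.random() < comp["prob"][j] else 0
            else:
                filled[j] = rng.gauss(comp["mean"][j], math.sqrt(comp["var"][j]))
    return filled

# Hypothetical two-component model over (continuous, binary, continuous) inputs.
# Entries that do not apply to a feature's type are left as None.
is_binary = [False, True, False]
components = [
    {"weight": 0.6, "mean": [1.0, None, 5.0], "var": [0.5, None, 1.0],
     "prob": [None, 0.9, None]},
    {"weight": 0.4, "mean": [-2.0, None, 0.0], "var": [0.5, None, 1.0],
     "prob": [None, 0.2, None]},
]

record = [1.2, None, None]  # only the first input was observed
rng = random.Random(0)
post = responsibilities(record, is_binary, components)
filled = impute(record, is_binary, components, rng)
```

Calling `impute` repeatedly on the same record yields different completions, which is what distinguishes stochastic imputation from filling in a single conditional mean. Each completed record could then be passed to a downstream predictor such as the MLP.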
Cite
Text
Rosen and Burke. "Applying a Gaussian-Bernoulli Mixture Model Network to Binary and Continuous Missing Data in Medicine." Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics, 1997.
Markdown
[Rosen and Burke. "Applying a Gaussian-Bernoulli Mixture Model Network to Binary and Continuous Missing Data in Medicine." Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics, 1997.](https://mlanthology.org/aistats/1997/rosen1997aistats-applying/)
BibTeX
@inproceedings{rosen1997aistats-applying,
title = {{Applying a Gaussian-Bernoulli Mixture Model Network to Binary and Continuous Missing Data in Medicine}},
author = {Rosen, David B. and Burke, Harry B.},
booktitle = {Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics},
year = {1997},
pages = {429-436},
volume = {R1},
url = {https://mlanthology.org/aistats/1997/rosen1997aistats-applying/}
}