Applying a Gaussian-Bernoulli Mixture Model Network to Binary and Continuous Missing Data in Medicine

Abstract

We wish to train a feedforward projective-sigmoidal neural network (MLP) on breast cancer outcomes data missing both binary and continuous input variable values. A Gaussian-Bernoulli mixture model is trained on the data (using EM). It then performs stochastic imputation (filling in) of the missing values, as a preprocessor to the MLP. In order to compare predictive accuracy when the training data are complete vs. incomplete/imputed, we use only complete cases from a natural data set, but artificially remove 80% of their input data values. Very little difference is observed in the comparison, suggesting that the mixture model is quite effective here, despite the fact that more than 99% of the cases/instances had some missing value(s). The mixture model can be used both for output/outcome prediction by a trained MLP and for the training process itself.
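The stochastic imputation step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the mixture parameters have already been fitted (for example by EM), that the Gaussian components have diagonal covariance, and that missing entries are marked with NaN. The function name `impute_row` and all parameter names and values are hypothetical.

```python
import numpy as np

def impute_row(x, is_binary, weights, means, variances, probs, rng):
    """Fill the NaN entries of x by sampling from an already-fitted mixture.

    Assumptions (not from the paper): diagonal Gaussian covariance,
    parameters stored as (K, D) arrays, NaN marks a missing value.
    """
    x = np.asarray(x, dtype=float)
    observed = ~np.isnan(x)
    cont_obs = observed & ~is_binary
    bin_obs = observed & is_binary

    # Log posterior of each component, using the observed entries only.
    log_post = np.log(weights)
    for k in range(len(weights)):
        d = x[cont_obs] - means[k, cont_obs]
        v = variances[k, cont_obs]
        log_post[k] += -0.5 * np.sum(np.log(2 * np.pi * v) + d**2 / v)
        p = probs[k, bin_obs]
        b = x[bin_obs]
        log_post[k] += np.sum(b * np.log(p) + (1 - b) * np.log(1 - p))
    post = np.exp(log_post - log_post.max())
    post /= post.sum()

    # Sample one component, then sample the missing entries from it.
    k = rng.choice(len(weights), p=post)
    filled = x.copy()
    miss_cont = ~observed & ~is_binary
    miss_bin = ~observed & is_binary
    filled[miss_cont] = rng.normal(means[k, miss_cont],
                                   np.sqrt(variances[k, miss_cont]))
    filled[miss_bin] = (rng.random(miss_bin.sum())
                        < probs[k, miss_bin]).astype(float)
    return filled

# Hypothetical two-component model over two continuous inputs and one binary input.
rng = np.random.default_rng(0)
is_binary = np.array([False, False, True])
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 0.0]])   # binary column unused
variances = np.ones((2, 3))                            # binary column unused
probs = np.array([[0.5, 0.5, 0.1], [0.5, 0.5, 0.9]])   # continuous columns unused
row = np.array([4.8, np.nan, np.nan])
filled = impute_row(row, is_binary, weights, means, variances, probs, rng)
```

Because the fill-in values are sampled rather than set to a conditional mean, repeated calls on the same row generally give different completions; observed entries are left unchanged.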

Cite

Text

Rosen and Burke. "Applying a Gaussian-Bernoulli Mixture Model Network to Binary and Continuous Missing Data in Medicine." Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics, 1997.

Markdown

[Rosen and Burke. "Applying a Gaussian-Bernoulli Mixture Model Network to Binary and Continuous Missing Data in Medicine." Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics, 1997.](https://mlanthology.org/aistats/1997/rosen1997aistats-applying/)

BibTeX

@inproceedings{rosen1997aistats-applying,
  title     = {{Applying a Gaussian-Bernoulli Mixture Model Network to Binary and Continuous Missing Data in Medicine}},
  author    = {Rosen, David B. and Burke, Harry B.},
  booktitle = {Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics},
  year      = {1997},
  pages     = {429-436},
  volume    = {R1},
  url       = {https://mlanthology.org/aistats/1997/rosen1997aistats-applying/}
}