Union of Intersections (UoI) for Interpretable Data Driven Discovery and Prediction

Abstract

The increasing size and complexity of scientific data could dramatically enhance discovery and prediction for basic scientific applications, e.g., neuroscience, genetics, and systems biology. Realizing this potential, however, requires novel statistical analysis methods that are both interpretable and predictive. We introduce the Union of Intersections (UoI) method, a flexible, modular, and scalable framework for enhanced model selection and estimation. The method performs model selection and model estimation through intersection and union operations, respectively. We show that UoI can satisfy the bi-criteria of low-variance and nearly unbiased estimation of a small number of interpretable features, while maintaining high-quality prediction accuracy. We perform an extensive numerical investigation to evaluate a UoI algorithm ($UoI_{Lasso}$) on synthetic and real data. In doing so, we demonstrate the extraction of interpretable functional networks from human electrophysiology recordings as well as the accurate prediction of phenotypes from genotype-phenotype data with reduced features. We also show (with the $UoI_{L1Logistic}$ and $UoI_{CUR}$ variants of the basic framework) improved prediction parsimony for classification and matrix factorization on several benchmark biomedical data sets. These results suggest that methods based on the UoI framework could improve interpretation and prediction in data-driven discovery across scientific fields.
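
As a rough illustration of the selection-by-intersection and estimation-by-union idea described in the abstract, the following Python sketch shows one way a $UoI_{Lasso}$-style procedure might look. It is not the authors' implementation: the function names `lasso_cd` and `uoi_lasso_sketch`, the bootstrap counts, the use of out-of-bag samples for scoring, the plain coordinate-descent Lasso solver, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate-descent Lasso for (1/2n)||y - X b||^2 + lam * ||b||_1 (illustrative solver)."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]  # remove feature j's current contribution
            rho = X[:, j] @ resid / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]  # soft threshold
            resid -= X[:, j] * beta[j]  # add the updated contribution back
    return beta


def uoi_lasso_sketch(X, y, lambdas, n_boot_sel=10, n_boot_est=10, seed=0):
    """Hypothetical UoI-style estimator: intersect supports, then average refitted estimates."""
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Selection (intersection): for each penalty, keep only the features
    # that the Lasso selects in every bootstrap resample.
    supports = []
    for lam in lambdas:
        support = np.ones(p, dtype=bool)
        for _ in range(n_boot_sel):
            idx = rng.integers(0, n, size=n)
            support &= lasso_cd(X[idx], y[idx], lam) != 0
        supports.append(support)

    # Estimation (union): for each resample, refit ordinary least squares on
    # every candidate support, keep the one with the lowest out-of-bag error
    # (an assumed scoring rule), and average the winners across resamples.
    estimates = []
    for _ in range(n_boot_est):
        idx = rng.integers(0, n, size=n)
        held_out = np.setdiff1d(np.arange(n), idx)
        best_beta, best_err = np.zeros(p), np.inf
        for support in supports:
            beta = np.zeros(p)
            if support.any():
                beta[support] = np.linalg.lstsq(X[idx][:, support], y[idx], rcond=None)[0]
            err = np.mean((y[held_out] - X[held_out] @ beta) ** 2)
            if err < best_err:
                best_beta, best_err = beta, err
        estimates.append(best_beta)
    return np.mean(estimates, axis=0)


# Synthetic example (made-up data): 3 informative features out of 10.
rng = np.random.default_rng(42)
X = rng.standard_normal((200, 10))
true_beta = np.zeros(10)
true_beta[:3] = [3.0, -2.0, 1.5]
y = X @ true_beta + 0.1 * rng.standard_normal(200)

beta_hat = uoi_lasso_sketch(X, y, lambdas=[0.01, 0.1, 0.5])
print(np.round(beta_hat, 3))
```

In this sketch, intersecting supports across resamples is meant to discard features that are selected only occasionally, while averaging the unpenalized refits combines the per-resample winners without the shrinkage a Lasso penalty would introduce. This is the intuition behind the low-variance, nearly unbiased estimates the abstract describes, under the assumptions stated above.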

Cite

Text

Bouchard et al. "Union of Intersections (UoI) for Interpretable Data Driven Discovery and Prediction." Neural Information Processing Systems, 2017.

Markdown

[Bouchard et al. "Union of Intersections (UoI) for Interpretable Data Driven Discovery and Prediction." Neural Information Processing Systems, 2017.](https://mlanthology.org/neurips/2017/bouchard2017neurips-union/)

BibTeX

@inproceedings{bouchard2017neurips-union,
  title     = {{Union of Intersections (UoI) for Interpretable Data Driven Discovery and Prediction}},
  author    = {Bouchard, Kristofer and Bujan, Alejandro and Roosta, Fred and Ubaru, Shashanka and Prabhat, Mr. and Snijders, Antoine and Mao, Jian-Hua and Chang, Edward and Mahoney, Michael W. and Bhattacharya, Sharmodeep},
  booktitle = {Neural Information Processing Systems},
  year      = {2017},
  pages     = {1078-1086},
  url       = {https://mlanthology.org/neurips/2017/bouchard2017neurips-union/}
}