Meta-Analysis of Heterogeneous Data: Integrative Sparse Regression in High-Dimensions

Abstract

We consider the task of meta-analysis in high-dimensional settings in which the data sources are similar but non-identical. To borrow strength across such heterogeneous datasets, we introduce a global parameter that emphasizes interpretability and statistical efficiency in the presence of heterogeneity. We also propose a one-shot estimator of the global parameter that preserves the anonymity of the data sources and converges at a rate that depends on the size of the combined dataset. For high-dimensional linear model settings, we demonstrate the superiority of our identification restrictions in adapting to a previously seen data distribution as well as predicting for a new/unseen data distribution. Finally, we demonstrate the benefits of our approach on a large-scale drug treatment dataset involving several different cancer cell-lines.

Cite

Text

Maity et al. "Meta-Analysis of Heterogeneous Data: Integrative Sparse Regression in High-Dimensions." Journal of Machine Learning Research, 2022.

Markdown

[Maity et al. "Meta-Analysis of Heterogeneous Data: Integrative Sparse Regression in High-Dimensions." Journal of Machine Learning Research, 2022.](https://mlanthology.org/jmlr/2022/maity2022jmlr-metaanalysis/)

BibTeX

@article{maity2022jmlr-metaanalysis,
  title     = {{Meta-Analysis of Heterogeneous Data: Integrative Sparse Regression in High-Dimensions}},
  author    = {Maity, Subha and Sun, Yuekai and Banerjee, Moulinath},
  journal   = {Journal of Machine Learning Research},
  year      = {2022},
  pages     = {1--50},
  volume    = {23},
  url       = {https://mlanthology.org/jmlr/2022/maity2022jmlr-metaanalysis/}
}