Bayesian Invariance Modeling of Multi-Environment Data
Abstract
Peters et al. (2016) introduced the problem of invariant modeling. In this problem, we observe feature/outcome data from multiple environments, and our goal is to identify the set of invariant features: those that maintain a stable predictive relationship with the outcome across environments. Identifying such features is important for robust generalization to new environments and for uncovering causal mechanisms. While previous methods primarily tackle this problem through hypothesis testing or regularized optimization, we take a Bayesian approach. We develop a probabilistic model of multi-environment data in which the indices of the invariant features are encoded as a latent variable. Under the same data-generating assumptions as Peters et al. (2016), we show that posterior inference in our model targets the true invariant features. We prove that this posterior is consistent, and we provide theoretical results about the posterior contraction rate. In particular, we show that, under a certain metric, greater heterogeneity among environments leads to faster contraction of the posterior. For settings where the number of features is large, we design an efficient variational inference algorithm to approximate the posterior. In both simulations and real-world data, we show that Bayesian invariance is more accurate and scalable than existing approaches.
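To make the idea of a latent set of invariant features concrete, the following is a minimal toy sketch, not the authors' model or their variational algorithm. It assumes linear-Gaussian regressions, a uniform prior over subsets, and a plug-in (maximum-likelihood) score in place of a proper marginal likelihood: each candidate subset is scored by how much log-likelihood is lost when the regression of the outcome on that subset is shared across environments instead of fit separately per environment. The names `gaussian_loglik`, `subset_pseudo_posterior`, and `simulate`, and the simulated causal chain x1 -> y -> x2, are hypothetical and chosen only for illustration.

```python
import itertools
import numpy as np


def gaussian_loglik(X, y):
    """Maximized Gaussian log-likelihood of a least-squares fit of y on X (with intercept)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    sigma2 = resid @ resid / n
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)


def subset_pseudo_posterior(envs, n_features):
    """Normalized weights over feature subsets (assumption: uniform prior, plug-in scores)."""
    subsets = [s for r in range(n_features + 1)
               for s in itertools.combinations(range(n_features), r)]
    scores = []
    for s in subsets:
        cols = list(s)
        X_all = np.vstack([X[:, cols] for X, _ in envs])
        y_all = np.concatenate([y for _, y in envs])
        pooled = gaussian_loglik(X_all, y_all)  # one regression shared by all environments
        separate = sum(gaussian_loglik(X[:, cols], y) for X, y in envs)  # one per environment
        scores.append(pooled - separate)  # near 0 when the shared fit loses little
    scores = np.array(scores)
    weights = np.exp(scores - scores.max())  # subtract the max for numerical stability
    return dict(zip(subsets, weights / weights.sum()))


def simulate(rng, n, x1_mean, x1_scale, x2_noise):
    """Toy environment: x1 -> y -> x2; only the y-given-x1 mechanism is held fixed."""
    x1 = rng.normal(x1_mean, x1_scale, n)
    y = 2.0 * x1 + rng.normal(0.0, 1.0, n)
    x2 = y + rng.normal(0.0, x2_noise, n)
    return np.column_stack([x1, x2]), y


rng = np.random.default_rng(0)
envs = [
    simulate(rng, 500, 0.0, 1.0, 0.5),
    simulate(rng, 500, 2.0, 2.0, 1.0),
    simulate(rng, 500, -1.0, 0.5, 2.0),
]
posterior = subset_pseudo_posterior(envs, n_features=2)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

In this toy setup only the subset containing x1 alone (index 0) gives a regression that is stable across the three environments, so the weights should concentrate there. Because the score carries no complexity penalty, it would not separate several equally invariant subsets, and enumerating all subsets is only feasible for a handful of features, which is the regime where an approximate method such as variational inference becomes necessary.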
Cite
Text
Wu et al. "Bayesian Invariance Modeling of Multi-Environment Data." ICLR 2025 Workshops: SCSL, 2025.
Markdown
[Wu et al. "Bayesian Invariance Modeling of Multi-Environment Data." ICLR 2025 Workshops: SCSL, 2025.](https://mlanthology.org/iclrw/2025/wu2025iclrw-bayesian/)
BibTeX
@inproceedings{wu2025iclrw-bayesian,
title = {{Bayesian Invariance Modeling of Multi-Environment Data}},
author = {Wu, Luhuan and Yin, Mingzhang and Wang, Yixin and Cunningham, John Patrick and Blei, David},
booktitle = {ICLR 2025 Workshops: SCSL},
year = {2025},
url = {https://mlanthology.org/iclrw/2025/wu2025iclrw-bayesian/}
}