Prediction-Powered Causal Inferences
Abstract
In many scientific experiments, the cost of annotating data constrains the pace of testing novel hypotheses. Yet, modern machine learning pipelines offer a promising solution—provided their predictions yield correct conclusions. We focus on Prediction-Powered Causal Inferences (PPCI), i.e., estimating the treatment effect in an unlabeled target experiment, relying on training data with the same outcome annotated but potentially different treatment or effect modifiers. We first show that conditional calibration guarantees valid PPCI at the population level. Then, we introduce a sufficient representation constraint that transfers validity across experiments, which we propose to enforce in practice via Deconfounded Empirical Risk Minimization, our new model-agnostic training objective. We validate our method on synthetic and real-world scientific data, solving problem instances that are impossible for Empirical Risk Minimization, even with standard invariance constraints. In particular, for the first time, we achieve valid causal inference on a scientific experiment with complex recordings and no human annotations, by fine-tuning a foundation model on our similar annotated experiment.
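To make the PPCI setup concrete, here is a minimal illustrative sketch (not the authors' implementation; all names and the simulated data are hypothetical): in a randomized target experiment whose outcomes are unannotated, a predictor trained on a similar annotated experiment imputes the outcomes, and the treatment effect is estimated from the imputed values by a difference in means.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
X = rng.normal(size=n)             # effect modifier / covariate
T = rng.integers(0, 2, size=n)     # randomized binary treatment

def predict_outcome(x, t):
    # Stand-in for a model fine-tuned on a similar annotated experiment.
    # Here it happens to match the true outcome model Y = X + 2*T,
    # so the true average treatment effect (ATE) is 2.
    return x + 2.0 * t

# Impute the unannotated outcomes, then estimate the ATE by difference in means.
Y_hat = predict_outcome(X, T)
ate_hat = Y_hat[T == 1].mean() - Y_hat[T == 0].mean()
print(f"Prediction-powered ATE estimate: {ate_hat:.3f}")
```

The paper's contribution concerns when such an estimate is *valid*, e.g. via conditional calibration of the predictor; an uncalibrated or confounded predictor would bias this difference in means.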
Cite
Text

Cadei et al. "Prediction-Powered Causal Inferences." Advances in Neural Information Processing Systems, 2025.

Markdown

[Cadei et al. "Prediction-Powered Causal Inferences." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/cadei2025neurips-predictionpowered/)

BibTeX
@inproceedings{cadei2025neurips-predictionpowered,
title = {{Prediction-Powered Causal Inferences}},
author = {Cadei, Riccardo and Demirel, Ilker and De Bartolomeis, Piersilvio and Lindorfer, Lukas and Cremer, Sylvia and Schmid, Cordelia and Locatello, Francesco},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/cadei2025neurips-predictionpowered/}
}