Towards Causal Foundation Model: On Duality Between Optimal Balancing and Attention

Abstract

Foundation models have brought changes to the landscape of machine learning, demonstrating sparks of human-level intelligence across a diverse array of tasks. However, a gap persists in complex tasks such as causal inference, primarily due to challenges associated with intricate reasoning steps and high numerical precision requirements. In this work, we take a first step towards building causally-aware foundation models for treatment effect estimation. We propose a novel, theoretically justified method called Causal Inference with Attention (CInA), which utilizes multiple unlabeled datasets to perform self-supervised causal learning, and subsequently enables zero-shot causal inference on unseen tasks with new data. This is based on our theoretical results demonstrating a primal-dual connection between optimal covariate balancing and self-attention, which facilitates zero-shot causal inference through the final layer of a trained transformer-type architecture. We demonstrate empirically that CInA effectively generalizes to out-of-distribution datasets and various real-world datasets, matching or even surpassing traditional per-dataset methodologies. These results provide compelling evidence that our method has the potential to serve as a stepping stone for the development of causal foundation models.

Cite

Text

Zhang et al. "Towards Causal Foundation Model: On Duality Between Optimal Balancing and Attention." International Conference on Machine Learning, 2024.

Markdown

[Zhang et al. "Towards Causal Foundation Model: On Duality Between Optimal Balancing and Attention." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/zhang2024icml-causal-a/)

BibTeX

@inproceedings{zhang2024icml-causal-a,
  title     = {{Towards Causal Foundation Model: On Duality Between Optimal Balancing and Attention}},
  author    = {Zhang, Jiaqi and Jennings, Joel and Hilmkil, Agrin and Pawlowski, Nick and Zhang, Cheng and Ma, Chao},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {59042--59065},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/zhang2024icml-causal-a/}
}