From Text to Treatment Effects: A Meta-Learning Approach to Handling Text-Based Confounding
Abstract
One of the central goals of causal machine learning is the accurate estimation of heterogeneous treatment effects from observational data. In recent years, meta-learning has emerged as a flexible, model-agnostic paradigm for estimating conditional average treatment effects (CATE) using any supervised model. This paper examines the performance of meta-learners when the confounding variables are expressed in text. Through synthetic data experiments, we show that learners using pre-trained text representations of confounders, in addition to tabular background variables, achieve improved CATE estimates compared to those relying solely on the tabular variables, particularly when sufficient data is available. However, due to the entangled nature of the text embeddings, these models do not fully match the performance of meta-learners with perfect confounder knowledge. These findings highlight both the potential and the limitations of pre-trained text representations for causal inference and open up interesting avenues for future research.
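To make the comparison in the abstract concrete, here is a minimal sketch of one meta-learner (a T-learner) fitted on three feature sets: tabular variables only, tabular variables plus a text embedding, and tabular variables plus the true confounder. It is a toy simulation, not the paper's experimental setup. The data-generating process, the ridge base model, and the "embedding" (simulated as a noisy linear mixture of a single hidden confounder) are all assumptions made for illustration, as are the helper names `fit_ridge`, `t_learner_cate`, and `pehe`.

```python
import numpy as np


def fit_ridge(features, target, penalty=1e-3):
    """Closed-form ridge regression with an intercept column (assumed base learner)."""
    design = np.column_stack([np.ones(len(features)), features])
    gram = design.T @ design + penalty * np.eye(design.shape[1])
    return np.linalg.solve(gram, design.T @ target)


def predict(weights, features):
    """Apply fitted ridge weights to new features."""
    design = np.column_stack([np.ones(len(features)), features])
    return design @ weights


def t_learner_cate(features, treatment, outcome):
    """T-learner: fit one outcome model per treatment arm, then take the difference."""
    w_treated = fit_ridge(features[treatment == 1], outcome[treatment == 1])
    w_control = fit_ridge(features[treatment == 0], outcome[treatment == 0])
    return predict(w_treated, features) - predict(w_control, features)


def pehe(estimate, truth):
    """Root mean squared error between estimated and true CATE."""
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))


# Hypothetical synthetic data: one hidden confounder drives treatment and outcome.
rng = np.random.default_rng(0)
n = 4000
tabular = rng.normal(size=(n, 2))
confounder = rng.normal(size=n)

# Stand-in for a pre-trained text embedding: the confounder is spread across
# 8 dimensions and mixed with noise, so it is only recoverable approximately.
mixing = rng.normal(size=8)
embedding = np.outer(confounder, mixing) + 0.5 * rng.normal(size=(n, 8))

# Treatment assignment depends on the confounder (this creates the confounding).
propensity = 1.0 / (1.0 + np.exp(-1.5 * confounder))
treatment = rng.binomial(1, propensity)

# Heterogeneous effect that depends on a tabular variable and the confounder.
true_cate = 1.0 + tabular[:, 1] + confounder
outcome = (tabular[:, 0] + 2.0 * confounder
           + true_cate * treatment + 0.1 * rng.normal(size=n))

# In-sample evaluation against the known effect, possible only on synthetic data.
pehe_tabular = pehe(t_learner_cate(tabular, treatment, outcome), true_cate)
pehe_text = pehe(
    t_learner_cate(np.column_stack([tabular, embedding]), treatment, outcome),
    true_cate,
)
pehe_oracle = pehe(
    t_learner_cate(np.column_stack([tabular, confounder]), treatment, outcome),
    true_cate,
)

print(f"tabular only:        {pehe_tabular:.3f}")
print(f"tabular + embedding: {pehe_text:.3f}")
print(f"tabular + oracle:    {pehe_oracle:.3f}")
```

In this toy setup the error should be smallest with the true confounder, larger with the noisy embedding, and largest with tabular variables alone, which mirrors the qualitative pattern the abstract describes. The specific numbers carry no evidential weight, since they depend entirely on the assumed data-generating process.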
Cite
Text
Arno et al. "From Text to Treatment Effects: A Meta-Learning Approach to Handling Text-Based Confounding." NeurIPS 2024 Workshops: CRL, 2024.
Markdown
[Arno et al. "From Text to Treatment Effects: A Meta-Learning Approach to Handling Text-Based Confounding." NeurIPS 2024 Workshops: CRL, 2024.](https://mlanthology.org/neuripsw/2024/arno2024neuripsw-text/)
BibTeX
@inproceedings{arno2024neuripsw-text,
title = {{From Text to Treatment Effects: A Meta-Learning Approach to Handling Text-Based Confounding}},
author = {Arno, Henri and Rabaey, Paloma and Demeester, Thomas},
booktitle = {NeurIPS 2024 Workshops: CRL},
year = {2024},
url = {https://mlanthology.org/neuripsw/2024/arno2024neuripsw-text/}
}