Causal Imitation Learning Under Expert-Observable and Expert-Unobservable Confounding

Abstract

We propose a general framework for causal Imitation Learning (IL) with hidden confounders that subsumes several existing settings. Our framework accounts for two types of hidden confounders: (a) variables observed by the expert but not by the imitator, and (b) confounding noise hidden from both. By leveraging trajectory histories as instruments, we reformulate causal IL in our framework as a Conditional Moment Restriction (CMR) problem. We propose DML-IL, an algorithm that solves this CMR problem via instrumental variable regression, and derive an upper bound on its imitation gap. Empirical evaluation on continuous state-action environments, including MuJoCo tasks, demonstrates that DML-IL outperforms existing causal IL baselines.
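
To illustrate why an instrument helps under hidden confounding, here is a minimal sketch of textbook scalar two-stage least squares on synthetic data. It is not DML-IL and not the paper's CMR formulation. The data-generating process, the variable names (`z` standing in for a trajectory-history instrument, `x` for the current state, `y` for the expert action, `u` for confounding noise), and the true coefficient of 2.0 are all assumptions made for illustration.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / n


def ols_slope(x, y):
    """Ordinary least squares slope of y on x (biased if x is confounded)."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Scalar 2SLS with one instrument: cov(z, y) / cov(z, x)."""
    return covariance(z, y) / covariance(z, x)


# Hypothetical synthetic data: u affects both x and y but is never observed.
rng = random.Random(0)
n = 20000
true_slope = 2.0  # assumed "expert policy" coefficient
z = [rng.gauss(0, 1) for _ in range(n)]  # instrument, independent of u
u = [rng.gauss(0, 1) for _ in range(n)]  # hidden confounding noise
x = [zi + ui + 0.1 * rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [true_slope * xi + ui + 0.1 * rng.gauss(0, 1) for xi, ui in zip(x, u)]

beta_ols = ols_slope(x, y)    # absorbs the confounder, overestimates the slope
beta_iv = iv_slope(z, x, y)   # uses only variation in x that comes from z

print(f"OLS estimate: {beta_ols:.3f}")
print(f"IV estimate:  {beta_iv:.3f}")
```

In this toy setting, naive regression attributes part of the confounder's effect to the state, while the instrument-based estimate stays close to the assumed true coefficient because `z` is independent of `u`.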

Cite

Text

Shao et al. "Causal Imitation Learning Under Expert-Observable and Expert-Unobservable Confounding." International Conference on Learning Representations, 2026.

Markdown

[Shao et al. "Causal Imitation Learning Under Expert-Observable and Expert-Unobservable Confounding." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/shao2026iclr-causal/)

BibTeX

@inproceedings{shao2026iclr-causal,
  title     = {{Causal Imitation Learning Under Expert-Observable and Expert-Unobservable Confounding}},
  author    = {Shao, Daqian and Buening, Thomas Kleine and Kwiatkowska, Marta},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/shao2026iclr-causal/}
}