Exploiting Causal Chains for Domain Generalization
Abstract
Invariant Causal Prediction provides a framework for domain (or out-of-distribution) generalization, predicated on the assumption that causal mechanisms are invariant across the data distributions of interest. Accordingly, given a sufficient number of distinct training distributions, the Invariant Risk Minimization (IRM) objective was proposed to learn this stable structure. However, recent work has identified limitations of IRM when it is extended to data-generating mechanisms that differ from those considered in its formulation. This work considers a chain generative process in which domain-specific exogenous factors influence all features, but the target is free of direct domain-specific influences. We propose a target-conditioned representation independence (TCRI) constraint, which enforces the mediating effect of the observed target with respect to the causal chain of latent features we aim to identify. We empirically show a setting where this approach outperforms both Empirical Risk Minimization (ERM) and IRM.
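To make the chain structure concrete, the following is a minimal simulation sketch of one possible reading of the abstract, not the authors' implementation. It assumes a linear-Gaussian chain z_c -> y -> z_e in which the domain changes only the scale of the exogenous noise on the two latent features, while the noise on the target is the same in every domain. The names `simulate_chain`, `partial_corr`, `z_c`, `z_e`, the coefficients, and the use of partial correlation as a stand-in for conditional independence are all illustrative assumptions.

```python
import numpy as np


def simulate_chain(n, domain_scale, rng):
    """Sample from a hypothetical linear-Gaussian chain z_c -> y -> z_e."""
    # Domain-specific exogenous noise acts on the first latent feature.
    z_c = rng.normal(0.0, domain_scale, size=n)
    # The target depends on z_c plus noise whose scale is fixed across domains,
    # so it has no direct domain-specific influence.
    y = 1.5 * z_c + rng.normal(0.0, 1.0, size=n)
    # The second latent feature depends on y plus domain-specific noise.
    z_e = -2.0 * y + rng.normal(0.0, domain_scale, size=n)
    return z_c, y, z_e


def partial_corr(a, b, given):
    """Correlation of a and b after linearly regressing `given` out of both."""
    design = np.column_stack([np.ones_like(given), given])
    res_a = a - design @ np.linalg.lstsq(design, a, rcond=None)[0]
    res_b = b - design @ np.linalg.lstsq(design, b, rcond=None)[0]
    return float(np.corrcoef(res_a, res_b)[0, 1])


rng = np.random.default_rng(0)
results = {}
for scale in (0.5, 2.0):  # two hypothetical domains
    z_c, y, z_e = simulate_chain(20_000, scale, rng)
    marginal = float(np.corrcoef(z_c, z_e)[0, 1])
    conditional = partial_corr(z_c, z_e, y)
    results[scale] = (marginal, conditional)
    print(f"scale={scale}: marginal={marginal:.3f}, given y={conditional:.3f}")
```

In this toy setting the two latent features are strongly correlated marginally, yet their partial correlation given the target is close to zero in both domains. That is the kind of target-mediated independence a TCRI-style constraint would aim to enforce on learned representations; how the paper actually formulates and optimizes the constraint is not specified in the abstract.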
Cite
Text
Salaudeen and Koyejo. "Exploiting Causal Chains for Domain Generalization." NeurIPS 2021 Workshops: DistShift, 2021.
Markdown
[Salaudeen and Koyejo. "Exploiting Causal Chains for Domain Generalization." NeurIPS 2021 Workshops: DistShift, 2021.](https://mlanthology.org/neuripsw/2021/salaudeen2021neuripsw-exploiting/)
BibTeX
@inproceedings{salaudeen2021neuripsw-exploiting,
title = {{Exploiting Causal Chains for Domain Generalization}},
author = {Salaudeen, Olawale Elijah and Koyejo, Oluwasanmi O},
booktitle = {NeurIPS 2021 Workshops: DistShift},
year = {2021},
url = {https://mlanthology.org/neuripsw/2021/salaudeen2021neuripsw-exploiting/}
}