Learning Invariant Representations with Missing Data
Abstract
Spurious correlations allow flexible models to predict well during training but poorly on related test populations. Recent work has shown that models that satisfy particular independencies involving correlation-inducing nuisance variables have guarantees on their test performance. Enforcing such independencies requires nuisances to be observed during training. However, nuisances, such as demographics or image background labels, are often missing. Enforcing independence on just the observed data does not imply independence on the entire population. Here we derive maximum mean discrepancy (MMD) estimators for invariance objectives when nuisances are missing. On simulations and clinical data, optimizing through these estimates achieves test performance similar to that of estimators that use the full data.
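To make the quantity in the abstract concrete, the following is a minimal sketch of a weighted squared-MMD estimate between two groups of representations. It is not the authors' estimator: the abstract does not state the estimators' form, so inverse-probability weighting is swapped in here as one standard missing-data correction. The names `rbf_kernel`, `weighted_mmd2`, the Gaussian kernel, the bandwidth, and the assumption that each sample's probability of having its nuisance observed is known are all hypothetical choices for illustration.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=1.0):
    # Gaussian kernel matrix between the rows of a and the rows of b.
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def weighted_mmd2(x, y, wx=None, wy=None, bandwidth=1.0):
    # Biased (V-statistic) estimate of squared MMD with per-sample weights.
    # With wx = wy = None this reduces to the usual unweighted estimate.
    wx = np.ones(len(x)) if wx is None else np.asarray(wx, dtype=float)
    wy = np.ones(len(y)) if wy is None else np.asarray(wy, dtype=float)
    wx = wx / wx.sum()  # self-normalize so each group's weights sum to 1
    wy = wy / wy.sum()
    kxx = wx @ rbf_kernel(x, x, bandwidth) @ wx
    kyy = wy @ rbf_kernel(y, y, bandwidth) @ wy
    kxy = wx @ rbf_kernel(x, y, bandwidth) @ wy
    return float(kxx + kyy - 2.0 * kxy)

# Toy data: representations for two nuisance groups (assumed, not from the paper).
rng = np.random.default_rng(0)
r0 = rng.normal(0.0, 1.0, size=(200, 2))
r1 = rng.normal(1.0, 1.0, size=(200, 2))

# Hypothetical, known probabilities that each sample's nuisance label is observed.
p0 = rng.uniform(0.3, 0.9, size=200)
p1 = rng.uniform(0.3, 0.9, size=200)
obs0 = rng.random(200) < p0
obs1 = rng.random(200) < p1

# Use only samples with observed nuisances, reweighted by 1 / P(observed).
mmd_ipw = weighted_mmd2(r0[obs0], r1[obs1], 1.0 / p0[obs0], 1.0 / p1[obs1])
print(mmd_ipw)
```

In practice the observation probabilities would have to be estimated, and an invariance objective would add a differentiable version of this estimate as a penalty during training; both steps are outside the scope of this sketch.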
Cite
Text
Goldstein et al. "Learning Invariant Representations with Missing Data." NeurIPS 2021 Workshops: DistShift, 2021.
Markdown
[Goldstein et al. "Learning Invariant Representations with Missing Data." NeurIPS 2021 Workshops: DistShift, 2021.](https://mlanthology.org/neuripsw/2021/goldstein2021neuripsw-learning/)
BibTeX
@inproceedings{goldstein2021neuripsw-learning,
title = {{Learning Invariant Representations with Missing Data}},
author = {Goldstein, Mark and Jacobsen, Joern-Henrik and Chau, Olina and Saporta, Adriel and Puli, Aahlad Manas and Ranganath, Rajesh and Miller, Andrew},
booktitle = {NeurIPS 2021 Workshops: DistShift},
year = {2021},
url = {https://mlanthology.org/neuripsw/2021/goldstein2021neuripsw-learning/}
}