Teaching Invariance Using Privileged Mediation Information
Abstract
The performance of deep neural networks often deteriorates in out-of-distribution settings because the networks rely on easy-to-learn but unreliable spurious associations known as shortcuts. Recent work attempting to mitigate shortcut learning relies on a priori knowledge of the shortcuts and on invariance penalties, which are difficult to enforce in practice. To address these limitations, we study two causally-motivated methods that efficiently learn models that are invariant to shortcuts by leveraging privileged mediation information. We first adapt concept bottleneck models (CBMs) to incorporate mediators -- intermediate variables that lie on the causal path between input features and target labels -- resulting in a straightforward extension we call Mediator Bottleneck Models (MBMs). One drawback of this method is that it requires two potentially large models at inference time. To address this issue, we propose Teaching Invariance using Privileged Mediation Information (TIPMI), a novel approach that distills knowledge from a counterfactually invariant teacher, trained using privileged mediation information, to a student predictor that uses non-privileged, easy-to-collect features. We analyze the theoretical properties of both estimators, showing that they promote invariance to an unknown shortcut and can result in better finite-sample efficiency compared to commonly used regularization schemes. We empirically validate our theoretical findings by showing that TIPMI and MBM outperform several state-of-the-art methods on one language and two vision datasets.
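The two ideas described in the abstract can be illustrated with a small linear toy example. This is only a sketch under strong assumptions, not the authors' implementation: the synthetic data-generating process, the helpers `fit_linear` and `predict_linear`, and all variable names are hypothetical, and ordinary least squares stands in for the deep networks and training losses the paper would use. The mediator `m` is available only at training time, the shortcut `v` is correlated with the label only through training-time noise, and the student sees only the non-privileged features `X`.

```python
import numpy as np

def fit_linear(inputs, targets):
    """Least-squares fit with an intercept; the last weight is the bias."""
    design = np.column_stack([inputs, np.ones(len(inputs))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights

def predict_linear(weights, inputs):
    """Apply weights produced by fit_linear to new inputs."""
    design = np.column_stack([inputs, np.ones(len(inputs))])
    return design @ weights

rng = np.random.default_rng(0)
n = 2000

# Hypothetical synthetic data (an assumption for illustration only).
x_core = rng.normal(size=n)                  # stable input feature
m = 3.0 * x_core + 0.1 * rng.normal(size=n)  # privileged mediator (training only)
eps = rng.normal(size=n)                     # label noise
y = 2.0 * m + eps                            # label depends on the mediator
v = eps + 0.5 * rng.normal(size=n)           # shortcut: correlated with y, not with m
X = np.column_stack([x_core, v])             # non-privileged features
M = m.reshape(-1, 1)

# Baseline: predict y directly from X; it can exploit the shortcut column.
baseline_w = fit_linear(X, y)

# Teacher: predicts y from the mediator alone, so it never sees the shortcut.
teacher_w = fit_linear(M, y)
soft_targets = predict_linear(teacher_w, M)

# Student (distillation): fit on X to match the teacher's outputs, not y.
student_w = fit_linear(X, soft_targets)
student_pred = predict_linear(student_w, X)

# Mediator bottleneck: X -> predicted mediator -> label.
# This route needs two models at inference time.
mediator_w = fit_linear(X, m)
m_hat = predict_linear(mediator_w, X).reshape(-1, 1)
mbm_pred = predict_linear(teacher_w, m_hat)

print("baseline weight on shortcut:", baseline_w[1])
print("student weight on shortcut: ", student_w[1])
```

In this toy setup the baseline places substantial weight on the shortcut column, while the distilled student's weight on it is close to zero, because the teacher's outputs carry no information about `v` beyond what `m` explains. In the purely linear case the bottleneck and the student produce the same predictions; the practical difference the abstract points to is that the student is a single model at inference time.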
Cite
Text
Zapzalka and Makar. "Teaching Invariance Using Privileged Mediation Information." Transactions on Machine Learning Research, 2026.
Markdown
[Zapzalka and Makar. "Teaching Invariance Using Privileged Mediation Information." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/zapzalka2026tmlr-teaching/)
BibTeX
@article{zapzalka2026tmlr-teaching,
title = {{Teaching Invariance Using Privileged Mediation Information}},
author = {Zapzalka, Dylan and Makar, Maggie},
journal = {Transactions on Machine Learning Research},
year = {2026},
url = {https://mlanthology.org/tmlr/2026/zapzalka2026tmlr-teaching/}
}