"Why Did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts
Abstract
Performance of machine learning models may differ significantly in novel environments than during training due to shifts in the underlying data distribution. Attributing performance changes to specific data shifts is critical for identifying sources of model failures and designing stable models. In this work, we design a novel method for attributing performance differences between environments to shifts in the underlying causal mechanisms. To this end, we construct a cooperative game where the contribution of each mechanism is quantified as its Shapley value. We demonstrate the ability of the method to identify sources of spurious correlation and to attribute performance drops to shifts in label and/or feature distributions on synthetic and real-world datasets.
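The core idea of quantifying each mechanism's contribution as its Shapley value can be illustrated with a small sketch. The snippet below is a generic exact Shapley computation, not the authors' implementation: the mechanism names (`"P(Y)"`, `"P(X|Y)"`) and the toy payoff table mapping shifted-mechanism subsets to performance drops are illustrative assumptions.

```python
from itertools import combinations
from math import comb

def shapley_values(players, value_fn):
    """Exact Shapley values for a small set of players.

    players  : list of hashable mechanism identifiers
    value_fn : maps a frozenset of players to a real payoff
               (here: the performance change observed when only
               those mechanisms are shifted to the new environment)
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Average p's marginal contribution over all coalitions of size k
            # that exclude p; the Shapley weight |S|!(n-|S|-1)!/n! equals
            # 1 / (n * C(n-1, |S|)).
            for subset in combinations(others, k):
                s = frozenset(subset)
                marginal = value_fn(s | {p}) - value_fn(s)
                total += marginal / (n * comb(n - 1, k))
        phi[p] = total
    return phi

# Toy example (hypothetical numbers): shifting the label mechanism P(Y)
# costs 5 points of accuracy, shifting P(X|Y) costs 10, and the effects add.
drop = {
    frozenset(): 0.0,
    frozenset({"P(Y)"}): -0.05,
    frozenset({"P(X|Y)"}): -0.10,
    frozenset({"P(Y)", "P(X|Y)"}): -0.15,
}
attributions = shapley_values(["P(Y)", "P(X|Y)"], lambda s: drop[s])
# Shapley values sum to the total performance drop (efficiency axiom).
```

In this additive toy case each mechanism is attributed exactly its own effect, and by the efficiency axiom the attributions always sum to the total performance change between environments.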
Cite
Text
Zhang et al. ""Why Did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts." ICML 2022 Workshops: SCIS, 2022.
Markdown
[Zhang et al. ""Why Did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts." ICML 2022 Workshops: SCIS, 2022.](https://mlanthology.org/icmlw/2022/zhang2022icmlw-model/)
BibTeX
@inproceedings{zhang2022icmlw-model,
title = {{"Why Did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts}},
author = {Zhang, Haoran and Singh, Harvineet and Joshi, Shalmali},
booktitle = {ICML 2022 Workshops: SCIS},
year = {2022},
url = {https://mlanthology.org/icmlw/2022/zhang2022icmlw-model/}
}