Finding and Fixing Spurious Patterns with Explanations
Abstract
Image classifiers often use spurious patterns, such as “relying on the presence of a person to detect a tennis racket,” which do not generalize. In this work, we present an end-to-end pipeline for identifying and mitigating spurious patterns for such models, under the assumption that we have access to pixel-wise object-annotations. We start by identifying patterns such as “the model’s prediction for tennis racket changes 63% of the time if we hide the people.” Then, if a pattern is spurious, we mitigate it via a novel form of data augmentation. We demonstrate that our method identifies a diverse set of spurious patterns and that it mitigates them by producing a model that is both more accurate on a distribution where the spurious pattern is not helpful and more robust to distribution shift.
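To make the identification step concrete, here is a minimal sketch of how a statistic such as "the prediction changes 63% of the time if we hide the people" could be computed. It is an illustration under stated assumptions, not the authors' implementation. The names `hide_object`, `prediction_change_rate`, and `toy_predict` are hypothetical, and filling the masked pixels with a constant gray value is an assumption about how an object is hidden.

```python
import numpy as np


def hide_object(image, mask, fill=0.5):
    """Return a copy of `image` with the pixels under `mask` set to `fill`.

    `image` is an H x W x C float array and `mask` is an H x W boolean array
    taken from a pixel-wise object annotation. The constant gray fill is an
    assumption made for this sketch.
    """
    out = image.copy()
    out[mask] = fill
    return out


def prediction_change_rate(predict, images, masks):
    """Fraction of images whose prediction changes once the object is hidden."""
    if len(images) == 0:
        raise ValueError("need at least one image")
    changed = 0
    for image, mask in zip(images, masks):
        changed += int(predict(image) != predict(hide_object(image, mask)))
    return changed / len(images)


def toy_predict(image):
    """Hypothetical stand-in for a classifier: fires on any very bright pixel."""
    return bool(image.max() > 0.9)


# Toy data: four images with a bright "person" patch covered by the mask.
# The last image also has a bright pixel outside the mask, so hiding the
# person does not change its prediction.
images, masks = [], []
for i in range(4):
    image = np.zeros((4, 4, 3))
    mask = np.zeros((4, 4), dtype=bool)
    mask[:2, :2] = True
    image[mask] = 1.0
    if i == 3:
        image[3, 3] = 1.0
    images.append(image)
    masks.append(mask)

rate = prediction_change_rate(toy_predict, images, masks)
```

On this toy data the prediction flips for three of the four images. With a real model, `predict` would wrap the classifier's decision for the target class, and the masks would come from the dataset's object annotations.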
Cite
Text
Plumb et al. "Finding and Fixing Spurious Patterns with Explanations." Transactions on Machine Learning Research, 2022.
Markdown
[Plumb et al. "Finding and Fixing Spurious Patterns with Explanations." Transactions on Machine Learning Research, 2022.](https://mlanthology.org/tmlr/2022/plumb2022tmlr-finding/)
BibTeX
@article{plumb2022tmlr-finding,
title = {{Finding and Fixing Spurious Patterns with Explanations}},
author = {Plumb, Gregory and Ribeiro, Marco Tulio and Talwalkar, Ameet},
journal = {Transactions on Machine Learning Research},
year = {2022},
url = {https://mlanthology.org/tmlr/2022/plumb2022tmlr-finding/}
}