Evaluating Systemic Error Detection Methods Using Synthetic Images

Abstract

We introduce SpotCheck, a framework for generating synthetic datasets for evaluating methods that discover blindspots (i.e., systemic errors) in image classifiers. We use SpotCheck to run controlled studies of how various factors influence the performance of blindspot discovery methods. Our experiments reveal several shortcomings of existing methods, such as relatively poor performance in settings with multiple blindspots and sensitivity to hyperparameters. Further, we find that PlaneSpot, a method based on dimensionality reduction, is competitive with existing methods, a result with promising implications for the development of interactive tools.

Cite

Text

Plumb et al. "Evaluating Systemic Error Detection Methods Using Synthetic Images." ICML 2022 Workshops: SCIS, 2022.

Markdown

[Plumb et al. "Evaluating Systemic Error Detection Methods Using Synthetic Images." ICML 2022 Workshops: SCIS, 2022.](https://mlanthology.org/icmlw/2022/plumb2022icmlw-evaluating/)

BibTeX

@inproceedings{plumb2022icmlw-evaluating,
  title     = {{Evaluating Systemic Error Detection Methods Using Synthetic Images}},
  author    = {Plumb, Gregory and Johnson, Nari and Cabrera, Ángel and Ribeiro, Marco Tulio and Talwalkar, Ameet},
  booktitle = {ICML 2022 Workshops: SCIS},
  year      = {2022},
  url       = {https://mlanthology.org/icmlw/2022/plumb2022icmlw-evaluating/}
}