What Could Go Wrong? Discovering and Describing Failure Modes in Computer Vision
Abstract
In this work, we propose a simple yet effective solution to predict and describe, in natural language, potential failure modes of computer vision models. Given a pretrained model and a set of samples, our aim is to find sentences that accurately describe the visual conditions in which the model underperforms. To study this important topic and foster future research on it, we formalize the problem of Language-Based Error Explainability (LBEE) and propose a set of metrics to evaluate and compare different methods for this task. We propose solutions that operate in a joint vision-and-language embedding space and can characterize, through language descriptions, model failures caused, e.g., by objects unseen during training or adverse visual conditions.
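To make the idea of working in a joint vision-and-language embedding space concrete, here is a minimal sketch. It is not the authors' method: it assumes image and sentence embeddings have already been computed by some joint model, and it uses a simple heuristic chosen for illustration (score each candidate sentence by how much closer it is to the mean embedding of failure samples than to the mean embedding of correctly handled samples). The function name `rank_failure_descriptions`, the toy 3-dimensional vectors, and the candidate sentences are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def rank_failure_descriptions(image_embs, is_failure, sentence_embs):
    """Rank candidate sentences by affinity to failure samples.

    Illustrative heuristic only (an assumption, not the paper's algorithm):
    score = cos(sentence, mean of failure images)
          - cos(sentence, mean of non-failure images).
    """
    hard = [e for e, f in zip(image_embs, is_failure) if f]
    easy = [e for e, f in zip(image_embs, is_failure) if not f]
    if not hard or not easy:
        raise ValueError("need at least one failure and one non-failure sample")
    hard_mean, easy_mean = mean_vector(hard), mean_vector(easy)
    scores = {
        sentence: cosine(emb, hard_mean) - cosine(emb, easy_mean)
        for sentence, emb in sentence_embs.items()
    }
    # Highest score first: sentences most specific to the failure set.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data standing in for embeddings from a joint vision-language model.
image_embs = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [0.1, 1.0, 0.0], [0.0, 0.9, 0.1]]
is_failure = [True, True, False, False]  # where the task model underperformed
sentence_embs = {
    "a photo taken at night": [1.0, 0.0, 0.0],
    "a photo taken in daylight": [0.0, 1.0, 0.0],
    "a photo of a street": [0.5, 0.5, 0.0],
}

ranked = rank_failure_descriptions(image_embs, is_failure, sentence_embs)
for sentence, score in ranked:
    print(f"{score:+.3f}  {sentence}")
```

In this toy setup the sentence aligned with the failure samples ranks first and the one aligned with the well-handled samples ranks last. With real data, the embeddings would come from a pretrained vision-language encoder, and the per-sample failure labels from evaluating the task model.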
Cite
Text
Csurka et al. "What Could Go Wrong? Discovering and Describing Failure Modes in Computer Vision." European Conference on Computer Vision Workshops, 2024. doi:10.1007/978-3-031-92648-8_12
Markdown
[Csurka et al. "What Could Go Wrong? Discovering and Describing Failure Modes in Computer Vision." European Conference on Computer Vision Workshops, 2024.](https://mlanthology.org/eccvw/2024/csurka2024eccvw-go/) doi:10.1007/978-3-031-92648-8_12
BibTeX
@inproceedings{csurka2024eccvw-go,
title = {{What Could Go Wrong? Discovering and Describing Failure Modes in Computer Vision}},
author = {Csurka, Gabriela and Hayes, Tyler L. and Larlus, Diane and Volpi, Riccardo},
booktitle = {European Conference on Computer Vision Workshops},
year = {2024},
pages = {183-199},
doi = {10.1007/978-3-031-92648-8_12},
url = {https://mlanthology.org/eccvw/2024/csurka2024eccvw-go/}
}