Performance Prediction for Semantic Segmentation by a Self-Supervised Image Reconstruction Decoder

Abstract

In supervised learning, a deep neural network’s performance is measured using ground truth data. In semantic segmentation, ground truth data is sparse, requires an expensive annotation process, and, most importantly, is not available during online operation. To tackle this problem, recent works propose various forms of performance prediction. However, they rely on inference data histograms, additional sensors, or additional training data. In this paper, we propose a novel per-image performance prediction method for semantic segmentation, with (i) no need for additional sensors (sensor efficiency), (ii) no need for additional training data (data efficiency), and (iii) no need for a dedicated retraining of the semantic segmentation network (training efficiency). Specifically, we extend an already trained semantic segmentation network having fixed parameters with an image reconstruction decoder. After training the decoder and a subsequent regression, the image reconstruction quality is evaluated to predict the semantic segmentation performance. We demonstrate our method’s effectiveness with a new state-of-the-art benchmark on both KITTI and Cityscapes for image-only input methods, on Cityscapes even surpassing a LiDAR-supported benchmark.
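
The regression step described above, which maps a per-image reconstruction quality to a predicted segmentation performance, can be illustrated with a minimal sketch. The abstract does not state which quality metric or regression model is used, so the choices here are assumptions made only for illustration: PSNR as the quality measure, a linear least-squares fit, and a score in [0, 1] as the performance target. The function names `psnr`, `fit_linear_regression`, and `predict_performance` are hypothetical, and the calibration numbers are synthetic, not results from the paper.

```python
import numpy as np


def psnr(original, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio (dB) between two images of equal shape.

    Assumption: PSNR stands in for the reconstruction quality measure.
    """
    original = np.asarray(original, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    mse = np.mean((original - reconstruction) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


def fit_linear_regression(quality, performance):
    """Least-squares fit of performance ~ slope * quality + intercept.

    Assumption: a linear model; the regression form is not given in the abstract.
    """
    slope, intercept = np.polyfit(quality, performance, deg=1)
    return float(slope), float(intercept)


def predict_performance(quality, slope, intercept):
    """Predict a segmentation score from reconstruction quality, clipped to [0, 1]."""
    return np.clip(slope * np.asarray(quality, dtype=np.float64) + intercept, 0.0, 1.0)


if __name__ == "__main__":
    # Synthetic calibration pairs (PSNR in dB, segmentation score); illustrative only.
    calib_quality = [20.0, 25.0, 30.0, 35.0]
    calib_performance = [0.4, 0.5, 0.6, 0.7]
    slope, intercept = fit_linear_regression(calib_quality, calib_performance)

    # At inference time only the input image and its reconstruction are needed.
    rng = np.random.default_rng(0)
    image = rng.random((8, 8, 3))
    reconstruction = np.clip(image + rng.normal(0.0, 0.05, image.shape), 0.0, 1.0)
    quality = psnr(image, reconstruction)
    print("predicted score:", float(predict_performance(quality, slope, intercept)))
```

In this sketch the regression is fitted once on images for which ground truth exists. Afterwards a score can be predicted from the input image and its reconstruction alone, which matches the online setting described in the abstract, where no ground truth is available.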

Cite

Text

Bär et al. "Performance Prediction for Semantic Segmentation by a Self-Supervised Image Reconstruction Decoder." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022. doi:10.1109/CVPRW56347.2022.00486

Markdown

[Bär et al. "Performance Prediction for Semantic Segmentation by a Self-Supervised Image Reconstruction Decoder." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022.](https://mlanthology.org/cvprw/2022/bar2022cvprw-performance/) doi:10.1109/CVPRW56347.2022.00486

BibTeX

@inproceedings{bar2022cvprw-performance,
  title     = {{Performance Prediction for Semantic Segmentation by a Self-Supervised Image Reconstruction Decoder}},
  author    = {Bär, Andreas and Klingner, Marvin and Löhdefink, Jonas and Hüger, Fabian and Schlicht, Peter and Fingscheidt, Tim},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2022},
  pages     = {4398-4407},
  doi       = {10.1109/CVPRW56347.2022.00486},
  url       = {https://mlanthology.org/cvprw/2022/bar2022cvprw-performance/}
}