Stochastic Segmentation with Conditional Categorical Diffusion Models

Abstract

Semantic segmentation has made significant progress in recent years thanks to deep neural networks, but the common objective of generating a single segmentation output that accurately matches the image's content may not be suitable for safety-critical domains such as medical diagnostics and autonomous driving. Instead, multiple possible correct segmentation maps may be required to reflect the true distribution of annotation maps. In this context, stochastic semantic segmentation methods must learn to predict conditional distributions of labels given the image, but this is challenging due to the typically multimodal distributions, high-dimensional output spaces, and limited annotation data. To address these challenges, we propose a conditional categorical diffusion model (CCDM) for semantic segmentation based on Denoising Diffusion Probabilistic Models. Our model is conditioned on the input image, enabling it to generate multiple segmentation label maps that account for the aleatoric uncertainty arising from divergent ground truth annotations. Our experimental results show that CCDM achieves state-of-the-art performance on LIDC, a stochastic semantic segmentation dataset, and outperforms established baselines on the classical segmentation dataset Cityscapes.
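To give a concrete sense of what diffusion over categorical label maps can look like, the following is a minimal sketch of one forward (noising) step. The abstract does not specify the noise process, so the uniform-transition form used here, where each pixel's label distribution is mixed with a uniform distribution over classes, is an assumption borrowed from the general categorical diffusion literature and not necessarily the authors' exact formulation. The names `forward_noise` and `alpha_bar_t` are hypothetical.

```python
import numpy as np

def forward_noise(x0_labels, alpha_bar_t, num_classes, rng):
    """Sample a noisy label map x_t from a clean label map x_0.

    Assumed form (not taken from the abstract):
        q(x_t | x_0) = Cat(alpha_bar_t * onehot(x_0) + (1 - alpha_bar_t) / K)
    """
    onehot = np.eye(num_classes)[x0_labels]                 # shape (H, W, K)
    probs = alpha_bar_t * onehot + (1.0 - alpha_bar_t) / num_classes
    # Per-pixel inverse-CDF sampling from the categorical distribution.
    cdf = np.cumsum(probs, axis=-1)
    u = rng.random(x0_labels.shape + (1,))
    xt = (u >= cdf).sum(axis=-1)
    # Guard against floating-point round-off in the last CDF entry.
    return np.minimum(xt, num_classes - 1), probs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.integers(0, 3, size=(4, 4))    # toy 4x4 label map, 3 classes
    xt, _ = forward_noise(x0, alpha_bar_t=0.5, num_classes=3, rng=rng)
    print(xt)
```

In a conditional model of this kind, a denoising network would take the noisy label map together with the input image and predict the clean labels; running the reverse process from different random starts then yields different plausible segmentations of the same image.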

Cite

Text

Zbinden et al. "Stochastic Segmentation with Conditional Categorical Diffusion Models." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00109

Markdown

[Zbinden et al. "Stochastic Segmentation with Conditional Categorical Diffusion Models." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/zbinden2023iccv-stochastic/) doi:10.1109/ICCV51070.2023.00109

BibTeX

@inproceedings{zbinden2023iccv-stochastic,
  title     = {{Stochastic Segmentation with Conditional Categorical Diffusion Models}},
  author    = {Zbinden, Lukas and Doorenbos, Lars and Pissas, Theodoros and Huber, Adrian Thomas and Sznitman, Raphael and Márquez-Neila, Pablo},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {1119-1129},
  doi       = {10.1109/ICCV51070.2023.00109},
  url       = {https://mlanthology.org/iccv/2023/zbinden2023iccv-stochastic/}
}