A Simple Recipe for Language-Guided Domain Generalized Segmentation

Abstract

Generalization to new domains not seen during training is one of the long-standing challenges in deploying neural networks in real-world applications. Existing generalization techniques either necessitate external images for augmentation, and/or aim at learning invariant representations by imposing various alignment constraints. Large-scale pretraining has recently shown promising generalization capabilities, along with the potential of binding different modalities. For instance, the advent of vision-language models like CLIP has opened the doorway for vision models to exploit the textual modality. In this paper, we introduce a simple framework for generalizing semantic segmentation networks by employing language as the source of randomization. Our recipe comprises three key ingredients: (i) the preservation of the intrinsic CLIP robustness through minimal fine-tuning, (ii) language-driven local style augmentation, and (iii) randomization by locally mixing the source and augmented styles during training. Extensive experiments report state-of-the-art results on various generalization benchmarks.
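To make ingredient (iii) more concrete, the following is a minimal NumPy sketch of what locally mixing source and augmented styles could look like. It is not the authors' implementation. It assumes that a "style" is the per-channel mean and standard deviation of a feature patch (as in AdaIN-style normalization), that the feature map is split into a regular grid of patches, and that the augmented statistics `aug_mu` and `aug_sigma` are already available. How those statistics are obtained from language (ingredient ii) is not shown. The function names, the `grid` parameter, and the uniform sampling of the mixing weight are all hypothetical choices made for illustration.

```python
import numpy as np


def patch_stats(feat, grid):
    """Split a (C, H, W) feature map into grid x grid patches and return
    the patches with their per-channel mean and std (the assumed 'style')."""
    c, h, w = feat.shape
    assert h % grid == 0 and w % grid == 0, "H and W must be divisible by grid"
    ph, pw = h // grid, w // grid
    # Shape (C, grid, ph, grid, pw): axes 2 and 4 index pixels inside a patch.
    patches = feat.reshape(c, grid, ph, grid, pw)
    mu = patches.mean(axis=(2, 4), keepdims=True)
    sigma = patches.std(axis=(2, 4), keepdims=True) + 1e-6  # avoid divide-by-zero
    return patches, mu, sigma


def mix_local_styles(feat, aug_mu, aug_sigma, grid, rng):
    """Re-stylize each patch with a random convex combination of its own
    (source) statistics and the given augmented statistics.

    aug_mu and aug_sigma must have shape (C, grid, 1, grid, 1).
    Sampling alpha uniformly in [0, 1] is an assumption of this sketch.
    """
    c, h, w = feat.shape
    patches, mu, sigma = patch_stats(feat, grid)
    # One mixing weight per channel and per patch.
    alpha = rng.uniform(0.0, 1.0, size=mu.shape)
    mix_mu = alpha * mu + (1.0 - alpha) * aug_mu
    mix_sigma = alpha * sigma + (1.0 - alpha) * aug_sigma
    # Normalize each patch, then apply the mixed statistics.
    out = mix_sigma * (patches - mu) / sigma + mix_mu
    return out.reshape(c, h, w)


rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 8, 8))          # toy source feature map
aug_mu = rng.normal(size=(4, 2, 1, 2, 1))  # placeholder augmented means
aug_sigma = rng.uniform(0.5, 1.5, size=(4, 2, 1, 2, 1))  # placeholder stds
mixed = mix_local_styles(feat, aug_mu, aug_sigma, grid=2, rng=rng)
```

Because the weights are drawn independently for each patch, different regions of the same feature map end up with different styles, which is the sense in which the mixing is local in this sketch.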

Cite

Text

Fahes et al. "A Simple Recipe for Language-Guided Domain Generalized Segmentation." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.02211

Markdown

[Fahes et al. "A Simple Recipe for Language-Guided Domain Generalized Segmentation." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/fahes2024cvpr-simple/) doi:10.1109/CVPR52733.2024.02211

BibTeX

@inproceedings{fahes2024cvpr-simple,
  title     = {{A Simple Recipe for Language-Guided Domain Generalized Segmentation}},
  author    = {Fahes, Mohammad and Vu, Tuan-Hung and Bursuc, Andrei and Pérez, Patrick and de Charette, Raoul},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {23428-23437},
  doi       = {10.1109/CVPR52733.2024.02211},
  url       = {https://mlanthology.org/cvpr/2024/fahes2024cvpr-simple/}
}