In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation

Abstract

We present Lazy Visual Grounding for open-vocabulary semantic segmentation, which decouples unsupervised object mask discovery from object grounding. Much of the prior art casts this task as pixel-to-text classification without object-level comprehension, leveraging the image-to-text classification capability of pretrained vision-and-language models. We argue that visual objects are distinguishable without prior text information, as segmentation is essentially a visual understanding task. Lazy visual grounding first discovers object masks covering an image with iterative Normalized cuts and then assigns text to the discovered objects in a late-interaction manner. Our model requires no additional training yet shows strong performance on five public datasets: Pascal VOC, Pascal Context, COCO-Object, COCO-Stuff, and ADE20K. In particular, the visually appealing segmentation results demonstrate the model's capability to localize objects precisely.
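The abstract describes a two-stage pipeline: text-free mask discovery via iterative Normalized cuts over patch features, followed by late-interaction grounding against text embeddings. The sketch below is a minimal, hypothetical illustration of that idea, not the paper's implementation: it assumes generic ViT-style patch features and precomputed text embeddings (stand-ins such as discover_masks and ground_masks are invented names), uses a single spectral bipartition per iteration, and labels each discovered mask by cosine similarity to the text embeddings.

import numpy as np

def normalized_cut_bipartition(features):
    """Bipartition patches by thresholding the second-smallest eigenvector
    of the normalized graph Laplacian (Shi & Malik-style Normalized cut)."""
    # Cosine-similarity affinity between patch features, clipped to be non-negative.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    W = np.clip(f @ f.T, 0.0, None)
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d + 1e-8))
    L_sym = np.eye(len(d)) - D_inv_sqrt @ W @ D_inv_sqrt
    _, vecs = np.linalg.eigh(L_sym)
    fiedler = vecs[:, 1]                    # second-smallest eigenvector
    return fiedler > fiedler.mean()         # boolean foreground assignment

def discover_masks(features, n_objects=3):
    """Iteratively peel off one object mask per round (assumed iteration scheme)."""
    remaining = np.arange(len(features))
    masks = []
    for _ in range(n_objects):
        if len(remaining) < 2:
            break
        fg = normalized_cut_bipartition(features[remaining])
        masks.append(remaining[fg])
        remaining = remaining[~fg]          # continue cutting the leftover patches
    return masks

def ground_masks(features, masks, text_embeddings, class_names):
    """Late interaction: average patch features per mask, then match to text."""
    t = text_embeddings / np.linalg.norm(text_embeddings, axis=1, keepdims=True)
    labels = []
    for m in masks:
        v = features[m].mean(axis=0)
        v = v / np.linalg.norm(v)
        labels.append(class_names[int(np.argmax(t @ v))])
    return labels

# Toy usage with random stand-ins for patch features and text embeddings.
rng = np.random.default_rng(0)
patch_feats = rng.normal(size=(196, 64))    # e.g. 14x14 grid of patch tokens
text_embs = rng.normal(size=(3, 64))        # one embedding per class prompt
print(ground_masks(patch_feats, discover_masks(patch_feats),
                   text_embs, ["cat", "dog", "background"]))

Because grounding happens only after masks are fixed, the text vocabulary can be swapped at inference time without recomputing the masks, which is the "lazy" aspect the title refers to.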

Cite

Text

Kang and Cho. "In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72940-9_9

Markdown

[Kang and Cho. "In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/kang2024eccv-defense/) doi:10.1007/978-3-031-72940-9_9

BibTeX

@inproceedings{kang2024eccv-defense,
  title     = {{In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation}},
  author    = {Kang, Dahyun and Cho, Minsu},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72940-9_9},
  url       = {https://mlanthology.org/eccv/2024/kang2024eccv-defense/}
}