NamedMask: Distilling Segmenters from Complementary Foundation Models
Abstract
The goal of this work is to segment and name regions of images without access to pixel-level labels during training. To tackle this task, we construct segmenters by distilling the complementary strengths of two foundation models. The first, CLIP [26], exhibits the ability to assign names to image content but lacks an accessible representation of object structure. The second, DINO [5], captures the spatial extent of objects but has no knowledge of object names. Our method, termed NamedMask, begins by using CLIP to construct category-specific archives of images. These images are pseudo-labelled with a category-agnostic salient object detector bootstrapped from DINO, then refined by category-specific segmenters using the CLIP archive labels. Thanks to the high quality of the refined masks, we show that a standard segmentation architecture trained on these archives with appropriate data augmentation achieves impressive semantic segmentation abilities for both single-object and multi-object images. As a result, our proposed NamedMask performs favourably against a range of prior work on five benchmarks including the VOC2012, COCO and large-scale ImageNet-S datasets.
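The first two stages described above (building category-specific archives with CLIP, then attaching the archive's category to a category-agnostic saliency mask) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes image and text embeddings have already been computed by some CLIP-like encoder, that archives are formed by top-k cosine similarity, and that pseudo-labels are produced by stamping the category index onto the salient pixels. The names `build_archives`, `label_saliency_mask`, and `top_k` are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def build_archives(image_emb, text_emb, category_names, top_k):
    """Assumed retrieval rule: keep the top_k most similar images per category.

    image_emb: (num_images, dim) precomputed image embeddings.
    text_emb:  (num_categories, dim) precomputed category-name embeddings.
    Returns a dict mapping each category name to a list of image indices.
    """
    img = l2_normalize(np.asarray(image_emb, dtype=np.float64))
    txt = l2_normalize(np.asarray(text_emb, dtype=np.float64))
    sim = txt @ img.T  # shape: (num_categories, num_images)
    archives = {}
    for c, name in enumerate(category_names):
        # Sort by descending similarity; stable sort keeps ties in index order.
        order = np.argsort(-sim[c], kind="stable")[:top_k]
        archives[name] = order.tolist()
    return archives


def label_saliency_mask(saliency_mask, category_id):
    """Assumed labelling rule: salient pixels take the archive's category id,
    all other pixels stay 0 (background)."""
    mask = np.asarray(saliency_mask)
    return np.where(mask > 0, category_id, 0)
```

In this sketch, the output of `label_saliency_mask` plays the role of the pseudo-labels on which a segmenter would then be trained; the refinement stage with category-specific segmenters is not modelled here.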
Cite
Text
Shin et al. "NamedMask: Distilling Segmenters from Complementary Foundation Models." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2023. doi:10.1109/CVPRW59228.2023.00524
Markdown
[Shin et al. "NamedMask: Distilling Segmenters from Complementary Foundation Models." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2023.](https://mlanthology.org/cvprw/2023/shin2023cvprw-namedmask/) doi:10.1109/CVPRW59228.2023.00524
BibTeX
@inproceedings{shin2023cvprw-namedmask,
title = {{NamedMask: Distilling Segmenters from Complementary Foundation Models}},
author = {Shin, Gyungin and Xie, Weidi and Albanie, Samuel},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2023},
pages = {4961-4970},
doi = {10.1109/CVPRW59228.2023.00524},
url = {https://mlanthology.org/cvprw/2023/shin2023cvprw-namedmask/}
}