Simpler Is Better: Few-Shot Semantic Segmentation with Classifier Weight Transformer

Zhihe Lu, Sen He, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang

ICCV 2021 pp. 8741-8750

doi:10.1109/ICCV48922.2021.00862

Abstract

A few-shot semantic segmentation model is typically composed of a CNN encoder, a CNN decoder and a simple classifier (separating foreground and background pixels). Most existing methods meta-learn all three model components for fast adaptation to a new class. However, given that as few as a single support-set image is available, effective adaptation of all three components to the new class is extremely challenging. In this work we propose to simplify the meta-learning task by focusing solely on the simplest component -- the classifier, whilst leaving the encoder and decoder to pre-training. We hypothesize that if we pre-train an off-the-shelf segmentation model over a set of diverse training classes with sufficient annotations, the encoder and decoder can capture rich discriminative features applicable to any unseen class, rendering the subsequent meta-learning stage unnecessary. For the classifier meta-learning, we introduce a Classifier Weight Transformer (CWT) designed to dynamically adapt the support-set trained classifier's weights to each query image in an inductive way. Extensive experiments on two standard benchmarks show that despite its simplicity, our method outperforms the state-of-the-art alternatives, often by a large margin. Code is available at https://github.com/zhiheLu/CWT-for-FSS.
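To make the idea of adapting classifier weights to each query image concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single attention head with identity projections (a trained model would learn these), treats the support-set classifier weights as the attention queries and the query image's pixel features as keys and values, and adds the attended context back to the weights as a residual. The names `adapt_classifier` and `classify_pixels` are hypothetical and chosen for illustration only.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def adapt_classifier(weights, query_feats):
    """Residual cross-attention from classifier weights to query-image features.

    weights:     list of class weight vectors (e.g. background, foreground), each of length d
    query_feats: list of n pixel feature vectors, each of length d
    Assumption: identity query/key/value projections, single head.
    """
    d = len(weights[0])
    adapted = []
    for w in weights:
        # Scaled dot-product score between this class weight and every pixel feature.
        scores = [sum(wi * fi for wi, fi in zip(w, f)) / math.sqrt(d) for f in query_feats]
        attn = softmax(scores)
        # Attention-weighted average of the pixel features.
        context = [sum(a * f[j] for a, f in zip(attn, query_feats)) for j in range(d)]
        # Residual update keeps the support-set solution and shifts it toward the query.
        adapted.append([wi + ci for wi, ci in zip(w, context)])
    return adapted


def classify_pixels(weights, feats):
    """Label each pixel with the index of the highest-scoring class weight."""
    labels = []
    for f in feats:
        logits = [sum(wi * fi for wi, fi in zip(w, f)) for w in weights]
        labels.append(logits.index(max(logits)))
    return labels


# Toy usage: two classes, two query pixels with 2-dimensional features.
support_weights = [[1.0, 0.0], [0.0, 1.0]]
query_pixels = [[2.0, 0.0], [0.0, 2.0]]
adapted_weights = adapt_classifier(support_weights, query_pixels)
labels = classify_pixels(adapted_weights, query_pixels)
```

Because only the classifier weights change per query image, the encoder and decoder features can stay fixed after pre-training, which is the simplification the abstract argues for.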

Cite

Text

Lu et al. "Simpler Is Better: Few-Shot Semantic Segmentation with Classifier Weight Transformer." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.00862

Markdown

[Lu et al. "Simpler Is Better: Few-Shot Semantic Segmentation with Classifier Weight Transformer." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/lu2021iccv-simpler/) doi:10.1109/ICCV48922.2021.00862

BibTeX

@inproceedings{lu2021iccv-simpler,
  title     = {{Simpler Is Better: Few-Shot Semantic Segmentation with Classifier Weight Transformer}},
  author    = {Lu, Zhihe and He, Sen and Zhu, Xiatian and Zhang, Li and Song, Yi-Zhe and Xiang, Tao},
  booktitle = {International Conference on Computer Vision},
  year      = {2021},
  pages     = {8741-8750},
  doi       = {10.1109/ICCV48922.2021.00862},
  url       = {https://mlanthology.org/iccv/2021/lu2021iccv-simpler/}
}