Hypercorrelation Squeeze for Few-Shot Segmentation

Abstract

Few-shot semantic segmentation aims to learn to segment a target object from a query image using only a few annotated support images of the target class. This challenging task requires understanding diverse levels of visual cues and analyzing fine-grained correspondence relations between the query and support images. To address the problem, we propose Hypercorrelation Squeeze Networks (HSNet), which leverage multi-level feature correlation and efficient 4D convolutions. HSNet extracts diverse features from different levels of intermediate convolutional layers and constructs a collection of 4D correlation tensors, i.e., hypercorrelations. Using efficient center-pivot 4D convolutions in a pyramidal architecture, the method gradually squeezes high-level semantic and low-level geometric cues of the hypercorrelation into precise segmentation masks in a coarse-to-fine manner. Significant performance improvements on the standard few-shot segmentation benchmarks PASCAL-5i, COCO-20i, and FSS-1000 verify the efficacy of the proposed method.
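To make the notion of a 4D correlation tensor concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that each feature map is a nested list indexed as `[row][column][channel]`, that correlation is measured by cosine similarity, and that negative scores are clamped to zero; the function name `correlation_4d` and the toy inputs are hypothetical. A hypercorrelation in the abstract's sense would collect one such tensor per intermediate layer.

```python
import math


def _unit(vec, eps=1e-8):
    """Scale a feature vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def correlation_4d(query, support):
    """Return corr[hq][wq][hs][ws]: clamped cosine similarity between
    every query position (hq, wq) and every support position (hs, ws).

    Both inputs are nested lists indexed as [row][column][channel].
    Clamping negatives to zero is an assumption made for this sketch.
    """
    q = [[_unit(v) for v in row] for row in query]
    s = [[_unit(v) for v in row] for row in support]
    return [
        [
            [
                [max(0.0, sum(a * b for a, b in zip(qv, sv))) for sv in s_row]
                for s_row in s
            ]
            for qv in q_row
        ]
        for q_row in q
    ]


# Toy inputs: a 2x2 query map and a 1x2 support map, 2 channels each.
query = [[[1.0, 0.0], [0.0, 1.0]],
         [[1.0, 1.0], [-1.0, 0.0]]]
support = [[[1.0, 0.0], [0.0, 2.0]]]

# The result has four spatial axes: 2 x 2 (query) by 1 x 2 (support).
corr = correlation_4d(query, support)
```

Because the tensor stores one score per pair of positions, its size grows with the product of the two spatial resolutions, which is why the abstract stresses efficient 4D convolutions for processing it.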

Cite

Text

Min et al. "Hypercorrelation Squeeze for Few-Shot Segmentation." International Conference on Computer Vision, 2021.

Markdown

[Min et al. "Hypercorrelation Squeeze for Few-Shot Segmentation." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/min2021iccv-hypercorrelation/)

BibTeX

@inproceedings{min2021iccv-hypercorrelation,
  title     = {{Hypercorrelation Squeeze for Few-Shot Segmentation}},
  author    = {Min, Juhong and Kang, Dahyun and Cho, Minsu},
  booktitle = {International Conference on Computer Vision},
  year      = {2021},
  pages     = {6941--6952},
  url       = {https://mlanthology.org/iccv/2021/min2021iccv-hypercorrelation/}
}