SemiCVT: Semi-Supervised Convolutional Vision Transformer for Semantic Segmentation

Abstract

Semi-supervised learning improves data efficiency of deep models by leveraging unlabeled samples to alleviate the reliance on a large set of labeled samples. These successes concentrate on pixel-wise consistency using convolutional neural networks (CNNs) but fail to address both global learning capability and class-level features for unlabeled data. Recent works point to a new trend in which the Transformer achieves superior performance on the entire feature map in various tasks. In this paper, we unify the current dominant Mean-Teacher approaches by reconciling intra-model and inter-model properties for semi-supervised segmentation to produce a novel algorithm, SemiCVT, that absorbs the quintessence of CNNs and Transformer in a comprehensive way. Specifically, we first design a parallel CNN-Transformer architecture (CVT) and introduce an intra-model local-global interaction schema (LGI) in the Fourier domain for full integration. The inter-model class-wise consistency is further presented to complement the class-level statistics of CNNs and Transformer in a cross-teaching manner. Extensive empirical evidence shows that SemiCVT yields consistent improvements over the state-of-the-art methods on two public benchmarks.
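
The abstract builds on Mean-Teacher approaches, in which a teacher model is commonly maintained as an exponential moving average (EMA) of the student's weights. The following is a minimal sketch of that generic update only, not the SemiCVT implementation: the function name `ema_update`, the use of plain dictionaries of floats as parameters, and the decay value are all assumptions made for illustration.

```python
def ema_update(teacher, student, decay=0.99):
    """Return new teacher parameters as an EMA of the student parameters.

    Hypothetical helper: parameters are plain {name: float} dicts here,
    standing in for real model weights.
    """
    if teacher.keys() != student.keys():
        raise ValueError("teacher and student must share parameter names")
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


# Toy parameters (illustrative values, not from the paper).
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 0.0, "b": 1.0}

# One EMA step: the teacher moves a small fraction toward the student.
teacher = ema_update(teacher, student, decay=0.9)
```

In a typical Mean-Teacher setup the student is trained by gradient descent, while the teacher is updated only through this averaging step and supplies targets for unlabeled samples.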

Cite

Text

Huang et al. "SemiCVT: Semi-Supervised Convolutional Vision Transformer for Semantic Segmentation." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.01091

Markdown

[Huang et al. "SemiCVT: Semi-Supervised Convolutional Vision Transformer for Semantic Segmentation." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/huang2023cvpr-semicvt/) doi:10.1109/CVPR52729.2023.01091

BibTeX

@inproceedings{huang2023cvpr-semicvt,
  title     = {{SemiCVT: Semi-Supervised Convolutional Vision Transformer for Semantic Segmentation}},
  author    = {Huang, Huimin and Xie, Shiao and Lin, Lanfen and Tong, Ruofeng and Chen, Yen-Wei and Li, Yuexiang and Wang, Hong and Huang, Yawen and Zheng, Yefeng},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {11340-11349},
  doi       = {10.1109/CVPR52729.2023.01091},
  url       = {https://mlanthology.org/cvpr/2023/huang2023cvpr-semicvt/}
}