SemiCVT: Semi-Supervised Convolutional Vision Transformer for Semantic Segmentation
Abstract
Semi-supervised learning improves the data efficiency of deep models by leveraging unlabeled samples to reduce the reliance on a large set of labeled samples. Existing successes concentrate on pixel-wise consistency using convolutional neural networks (CNNs) but fail to address both global learning capability and class-level features for unlabeled data. Recent works point to a new trend in which the Transformer achieves superior performance on the entire feature map in various tasks. In this paper, we unify the currently dominant Mean-Teacher approaches by reconciling intra-model and inter-model properties for semi-supervised segmentation to produce a novel algorithm, SemiCVT, that absorbs the quintessence of CNNs and the Transformer in a comprehensive way. Specifically, we first design a parallel CNN-Transformer architecture (CVT) and introduce an intra-model local-global interaction schema (LGI) in the Fourier domain for full integration. An inter-model class-wise consistency is further presented to complement the class-level statistics of CNNs and the Transformer in a cross-teaching manner. Extensive empirical evidence shows that SemiCVT yields consistent improvements over state-of-the-art methods on two public benchmarks.
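For readers unfamiliar with the Mean-Teacher family mentioned above: such methods commonly keep a teacher model whose weights are an exponential moving average (EMA) of the student's weights. The following is a minimal sketch of that generic update only, assuming weights stored in a plain Python dictionary; the function name `ema_update` and the decay value are illustrative assumptions and are not taken from the paper. It does not cover the CVT architecture, LGI, or the class-wise consistency.

```python
def ema_update(teacher, student, decay=0.99):
    """Return teacher weights moved toward the student by an EMA step.

    `teacher` and `student` are hypothetical dicts mapping a parameter
    name to a float; real models would hold tensors instead.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


# Toy weights, chosen only for illustration.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 0.0, "b": 1.0}

# One EMA step: the teacher drifts slightly toward the student.
teacher = ema_update(teacher, student, decay=0.9)
```

A decay close to 1 makes the teacher change slowly, which is the usual motivation for using it as a stable source of targets on unlabeled data.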
Cite
Text
Huang et al. "SemiCVT: Semi-Supervised Convolutional Vision Transformer for Semantic Segmentation." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.01091
Markdown
[Huang et al. "SemiCVT: Semi-Supervised Convolutional Vision Transformer for Semantic Segmentation." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/huang2023cvpr-semicvt/) doi:10.1109/CVPR52729.2023.01091
BibTeX
@inproceedings{huang2023cvpr-semicvt,
title = {{SemiCVT: Semi-Supervised Convolutional Vision Transformer for Semantic Segmentation}},
author = {Huang, Huimin and Xie, Shiao and Lin, Lanfen and Tong, Ruofeng and Chen, Yen-Wei and Li, Yuexiang and Wang, Hong and Huang, Yawen and Zheng, Yefeng},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2023},
pages = {11340-11349},
doi = {10.1109/CVPR52729.2023.01091},
url = {https://mlanthology.org/cvpr/2023/huang2023cvpr-semicvt/}
}