Hybrid Transformer and CNN Attention Network for Stereo Image Super-Resolution
Abstract
Multi-stage strategies are frequently employed in image restoration tasks. While transformer-based methods have exhibited high efficiency in single-image super-resolution tasks, they have not yet shown significant advantages over CNN-based methods in stereo super-resolution tasks. This can be attributed to two key factors: first, current single-image super-resolution transformers are unable to leverage the complementary stereo information during the process; second, the performance of transformers is typically reliant on sufficient data, which is absent in common stereo-image super-resolution algorithms. To address these issues, we propose a Hybrid Transformer and CNN Attention Network (HTCAN), which utilizes a transformer-based network for single-image enhancement and a CNN-based network for stereo information fusion. Furthermore, we employ a multi-patch training strategy and larger window sizes to activate more input pixels for super-resolution. We also revisit other advanced techniques, such as data augmentation, data ensemble, and model ensemble, to reduce overfitting and data bias. Finally, our approach achieved a score of 23.90 dB and emerged as the winner in Track 1 of the NTIRE 2023 Stereo Image Super-Resolution Challenge.
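The abstract mentions a "data ensemble" without describing it. One common reading is a test-time self-ensemble over flipped inputs, which could be sketched as follows. This is an illustrative assumption, not the authors' implementation: `stereo_self_ensemble` and the `model` callable (taking and returning a left/right pair of H x W x C arrays) are hypothetical names, and the view swap on horizontal flips is a standard stereo convention rather than a detail taken from the paper.

```python
import numpy as np


def stereo_self_ensemble(model, left, right):
    """Average a stereo SR model's outputs over flipped copies of the input.

    `model` is a hypothetical callable: (left, right) -> (sr_left, sr_right),
    with every image stored as an H x W x C array.
    """
    outputs_left, outputs_right = [], []
    for vflip in (False, True):
        for hflip in (False, True):
            l, r = left, right
            if vflip:
                # A vertical flip keeps the left/right roles unchanged.
                l, r = l[::-1], r[::-1]
            if hflip:
                # A horizontal flip mirrors the disparity direction,
                # so the two views swap roles.
                l, r = r[:, ::-1], l[:, ::-1]

            sr_l, sr_r = model(l, r)

            # Undo the transforms in reverse order before averaging.
            if hflip:
                sr_l, sr_r = sr_r[:, ::-1], sr_l[:, ::-1]
            if vflip:
                sr_l, sr_r = sr_l[::-1], sr_r[::-1]
            outputs_left.append(sr_l)
            outputs_right.append(sr_r)

    return np.mean(outputs_left, axis=0), np.mean(outputs_right, axis=0)
```

A model ensemble would follow the same pattern, averaging the outputs of several trained networks instead of several transformed inputs.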
Cite
Text
Cheng et al. "Hybrid Transformer and CNN Attention Network for Stereo Image Super-Resolution." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2023. doi:10.1109/CVPRW59228.2023.00171
Markdown
[Cheng et al. "Hybrid Transformer and CNN Attention Network for Stereo Image Super-Resolution." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2023.](https://mlanthology.org/cvprw/2023/cheng2023cvprw-hybrid/) doi:10.1109/CVPRW59228.2023.00171
BibTeX
@inproceedings{cheng2023cvprw-hybrid,
title = {{Hybrid Transformer and CNN Attention Network for Stereo Image Super-Resolution}},
author = {Cheng, Ming and Ma, Haoyu and Ma, Qiufang and Sun, Xiaopeng and Li, Weiqi and Zhang, Zhenyu and Sheng, Xuhan and Zhao, Shijie and Li, Junlin and Zhang, Li},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2023},
pages = {1702-1711},
doi = {10.1109/CVPRW59228.2023.00171},
url = {https://mlanthology.org/cvprw/2023/cheng2023cvprw-hybrid/}
}