Enriched CNN-Transformer Feature Aggregation Networks for Super-Resolution

Abstract

Recent transformer-based super-resolution (SR) methods have achieved promising results compared with conventional CNN-based methods. However, these approaches suffer from an essential shortsightedness caused by relying only on standard self-attention-based reasoning. In this paper, we introduce an effective hybrid SR network to aggregate enriched features, including local features from CNNs and long-range multi-scale dependencies captured by transformers. Specifically, our network comprises transformer and convolutional branches, which synergistically complement each other's representations during the restoration procedure. Furthermore, we propose a cross-scale token attention module, allowing the transformer branch to exploit the informative relationships among tokens across different scales efficiently. Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.
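The abstract describes a dual-branch design: a convolutional branch for local features and a transformer branch for long-range dependencies, whose outputs are aggregated. The sketch below is a minimal PyTorch illustration of that general idea, not the authors' implementation; the class name `HybridSRBlock`, the fusion by concatenation plus a 1x1 convolution, and the use of plain multi-head self-attention (in place of the paper's cross-scale token attention) are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class HybridSRBlock(nn.Module):
    """Illustrative dual-branch block: a CNN branch for local features and a
    self-attention branch for long-range dependencies, fused back together.
    Hypothetical sketch; not the architecture from the paper."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # Convolutional branch: captures local structure.
        self.conv_branch = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Transformer branch: standard multi-head self-attention over
        # flattened spatial tokens (a stand-in for cross-scale token attention).
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        # Fusion: aggregate both branches back to `channels` features.
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.conv_branch(x)
        tokens = self.norm(x.flatten(2).transpose(1, 2))        # (B, H*W, C)
        global_feat, _ = self.attn(tokens, tokens, tokens)      # long-range mixing
        global_feat = global_feat.transpose(1, 2).reshape(b, c, h, w)
        # Residual connection around the fused branches.
        return x + self.fuse(torch.cat([local, global_feat], dim=1))


if __name__ == "__main__":
    block = HybridSRBlock(64)
    out = block(torch.randn(1, 64, 32, 32))
    print(out.shape)  # torch.Size([1, 64, 32, 32])
```

A full SR network would stack several such blocks and finish with an upsampling stage (e.g., pixel shuffle); this snippet only shows how the two feature types might be aggregated.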

Cite

Text

Yoo et al. "Enriched CNN-Transformer Feature Aggregation Networks for Super-Resolution." Winter Conference on Applications of Computer Vision, 2023.

Markdown

[Yoo et al. "Enriched CNN-Transformer Feature Aggregation Networks for Super-Resolution." Winter Conference on Applications of Computer Vision, 2023.](https://mlanthology.org/wacv/2023/yoo2023wacv-enriched/)

BibTeX

@inproceedings{yoo2023wacv-enriched,
  title     = {{Enriched CNN-Transformer Feature Aggregation Networks for Super-Resolution}},
  author    = {Yoo, Jinsu and Kim, Taehoon and Lee, Sihaeng and Kim, Seung Hwan and Lee, Honglak and Kim, Tae Hyun},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2023},
  pages     = {4956--4965},
  url       = {https://mlanthology.org/wacv/2023/yoo2023wacv-enriched/}
}