Dual Aggregation Transformer for Image Super-Resolution

Zheng Chen, Yulun Zhang, Jinjin Gu, Linghe Kong, Xiaokang Yang, Fisher Yu

ICCV 2023 pp. 12312-12321

doi:10.1109/ICCV51070.2023.01131

Abstract

Transformers have recently gained considerable popularity in low-level vision tasks, including image super-resolution (SR). These networks apply self-attention along different dimensions, spatial or channel, and achieve impressive performance. This motivates us to combine the two dimensions in a single Transformer for stronger representation capability. Based on this idea, we propose a novel Transformer model, the Dual Aggregation Transformer (DAT), for image SR. DAT aggregates features across the spatial and channel dimensions in a dual manner: inter-block and intra-block. Specifically, we alternately apply spatial and channel self-attention in consecutive Transformer blocks. This alternating strategy enables DAT to capture global context and realize inter-block feature aggregation. Furthermore, we propose the adaptive interaction module (AIM) and the spatial-gate feed-forward network (SGFN) to achieve intra-block feature aggregation. AIM complements the two self-attention mechanisms from their corresponding dimensions. Meanwhile, SGFN introduces additional non-linear spatial information into the feed-forward network. Extensive experiments show that DAT surpasses current methods. Code and models are available at https://github.com/zhengchen1999/DAT.
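
The alternation between spatial and channel self-attention described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes a single head, no learned query/key/value projections, no windowing, and it omits AIM and SGFN entirely. The function names (`spatial_attention`, `channel_attention`, `alternating_blocks`) are hypothetical and chosen only for this illustration. The point is the shape of the attention map: N x N over tokens for spatial attention, C x C over channels for channel attention.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns of a list-of-rows matrix."""
    return [list(col) for col in zip(*m)]

def softmax_rows(m):
    """Numerically stable softmax applied independently to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def spatial_attention(x):
    """x is N tokens by C channels; the attention map is N x N (token to token)."""
    scale = 1.0 / math.sqrt(len(x[0]))
    scores = [[v * scale for v in row] for row in matmul(x, transpose(x))]
    return matmul(softmax_rows(scores), x)

def channel_attention(x):
    """Same input layout; the attention map is C x C (channel to channel)."""
    xt = transpose(x)  # C x N
    scale = 1.0 / math.sqrt(len(x))
    scores = [[v * scale for v in row] for row in matmul(xt, x)]  # C x C
    return transpose(matmul(softmax_rows(scores), xt))  # back to N x C

def alternating_blocks(x, depth):
    """Alternate spatial and channel attention in consecutive blocks (assumed residual add)."""
    for i in range(depth):
        attn = spatial_attention if i % 2 == 0 else channel_attention
        y = attn(x)
        x = [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]
    return x

# Toy input: N = 4 tokens, C = 3 channels (made-up values for illustration only).
tokens = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9], [1.0, 1.1, 1.2]]
out = alternating_blocks(tokens, depth=4)
print(len(out), len(out[0]))
```

Both attention variants return a tensor with the input's N x C layout, which is what allows them to be stacked in any order. In this sketch the cost of the attention map grows with N squared for the spatial variant and with C squared for the channel variant.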

Cite

Text

Chen et al. "Dual Aggregation Transformer for Image Super-Resolution." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.01131

Markdown

[Chen et al. "Dual Aggregation Transformer for Image Super-Resolution." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/chen2023iccv-dual/) doi:10.1109/ICCV51070.2023.01131

BibTeX

@inproceedings{chen2023iccv-dual,
  title     = {{Dual Aggregation Transformer for Image Super-Resolution}},
  author    = {Chen, Zheng and Zhang, Yulun and Gu, Jinjin and Kong, Linghe and Yang, Xiaokang and Yu, Fisher},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {12312--12321},
  doi       = {10.1109/ICCV51070.2023.01131},
  url       = {https://mlanthology.org/iccv/2023/chen2023iccv-dual/}
}