Efficient Concertormer for Image Deblurring and Beyond

Kuo, Pin-Hung; Pan, Jinshan; Chien, Shao-Yi; Yang, Ming-Hsuan

doi:10.1109/ICCV51701.2025.01361

ICCV 2025 pp. 14665-14675

Abstract

The Transformer architecture has excelled in NLP and vision tasks, but the complexity of its self-attention grows quadratically with image size, making high-resolution tasks computationally expensive. We introduce Concertormer, featuring Concerto Self-Attention (CSA) for image deblurring. CSA splits self-attention into global and local components while retaining partial information in additional dimensions, achieving linear complexity. A Cross-Dimensional Communication module enhances expressiveness by linearly combining attention maps. Additionally, our gated-dconv MLP merges the two-stage Transformer design into a single stage. Extensive evaluations show that our method performs favorably against state-of-the-art methods in deblurring, deraining, and JPEG artifact removal.
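
To make the complexity claim concrete, the following is a rough sketch and not the authors' implementation. It counts the query-key pairs scored by full self-attention and by attention restricted to fixed-size non-overlapping windows, and it shows what linearly combining attention maps means in the simplest case. The function names, the window size of 8, and the fixed mixing weights are assumptions made only for illustration; the abstract does not specify how CSA partitions the image or how the Cross-Dimensional Communication weights are obtained.

```python
def full_attention_pairs(height, width):
    """Query-key pairs scored by full self-attention over all pixels."""
    tokens = height * width
    return tokens * tokens


def windowed_attention_pairs(height, width, window):
    """Query-key pairs when attention is restricted to non-overlapping
    window x window tiles (hypothetical partition, for illustration only)."""
    if height % window or width % window:
        raise ValueError("height and width must be multiples of window")
    num_windows = (height // window) * (width // window)
    tokens_per_window = window * window
    return num_windows * tokens_per_window * tokens_per_window


def mix_attention_maps(maps, weights):
    """Linearly combine equally sized attention maps given as lists of rows.
    The weights are plain numbers here; a real model would learn them."""
    rows, cols = len(maps[0]), len(maps[0][0])
    return [
        [sum(w * m[i][j] for w, m in zip(weights, maps)) for j in range(cols)]
        for i in range(rows)
    ]


if __name__ == "__main__":
    # Doubling the image side multiplies the full cost by 16 but the
    # windowed cost only by 4, i.e. the latter is linear in pixel count.
    for side in (64, 128, 256):
        print(side,
              full_attention_pairs(side, side),
              windowed_attention_pairs(side, side, 8))
```

With a fixed window, the windowed count equals the number of pixels times the squared window size, so it grows linearly with image area, whereas the full count grows with the square of the area.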

Cite

Text

Kuo et al. "Efficient Concertormer for Image Deblurring and Beyond." International Conference on Computer Vision, 2025. doi:10.1109/ICCV51701.2025.01361

Markdown

[Kuo et al. "Efficient Concertormer for Image Deblurring and Beyond." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/kuo2025iccv-efficient/) doi:10.1109/ICCV51701.2025.01361

BibTeX

@inproceedings{kuo2025iccv-efficient,
  title     = {{Efficient Concertormer for Image Deblurring and Beyond}},
  author    = {Kuo, Pin-Hung and Pan, Jinshan and Chien, Shao-Yi and Yang, Ming-Hsuan},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {14665--14675},
  doi       = {10.1109/ICCV51701.2025.01361},
  url       = {https://mlanthology.org/iccv/2025/kuo2025iccv-efficient/}
}