CAMixerSR: Only Details Need More "Attention"

Abstract

To satisfy the rapidly increasing demands on large-image (2K-8K) super-resolution (SR), prevailing methods follow two independent tracks: 1) accelerating existing networks by content-aware routing, and 2) designing better super-resolution networks via token-mixer refinement. Despite their directness, both encounter unavoidable defects (e.g., inflexible routing or non-discriminative processing) that limit further improvement of the quality-complexity trade-off. To erase these drawbacks, we integrate the two schemes by proposing a content-aware mixer (CAMixer), which assigns convolution to simple contexts and additional deformable window-attention to sparse textures. Specifically, CAMixer uses a learnable predictor to generate multiple bootstraps, including offsets for window warping, a mask for classifying windows, and convolutional attentions that endow convolution with a dynamic property, which modulates attention to include more useful textures self-adaptively and improves the representation capability of convolution. We further introduce a global classification loss to improve the accuracy of the predictor. By simply stacking CAMixers, we obtain CAMixerSR, which achieves superior performance on large-image SR, lightweight SR, and omnidirectional-image SR.
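The core routing idea can be illustrated with a minimal sketch: partition the image into windows, score each window's "difficulty," and send hard windows to the expensive attention branch while easy ones keep the cheap convolution path. The variance-based score, 4x4 window size, and threshold below are hypothetical stand-ins for the paper's learned predictor and mask, used only to show the classification mechanism.

```python
# Sketch of content-aware window routing. A simple variance score stands in
# for CAMixer's learned predictor (an assumption, not the paper's method).

def split_windows(image, win):
    """Split a 2D list `image` into non-overlapping win x win windows."""
    h, w = len(image), len(image[0])
    windows = []
    for i in range(0, h, win):
        for j in range(0, w, win):
            windows.append([row[j:j + win] for row in image[i:i + win]])
    return windows

def difficulty(window):
    """Pixel variance: a hypothetical proxy for the learned window mask."""
    vals = [v for row in window for v in row]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

def route(windows, threshold):
    """True -> deformable window-attention branch; False -> convolution."""
    return [difficulty(w) > threshold for w in windows]

# A toy 4x8 image: a smooth left half and a textured (checkerboard) right half.
flat = [[0.0] * 4 for _ in range(4)]
textured = [[float((i + j) % 2) for j in range(4)] for i in range(4)]
image = [flat[i] + textured[i] for i in range(4)]

mask = route(split_windows(image, 4), 0.1)
print(mask)  # -> [False, True]: smooth window -> convolution, textured -> attention
```

In the actual model the predictor is trained end-to-end (aided by the global classification loss) rather than hand-crafted, but the routing structure is the same: only detailed windows receive the extra attention cost.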

Cite

Text

Wang et al. "CAMixerSR: Only Details Need More "Attention"." Conference on Computer Vision and Pattern Recognition, 2024.

Markdown

[Wang et al. "CAMixerSR: Only Details Need More "Attention"." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/wang2024cvpr-camixersr/)

BibTeX

@inproceedings{wang2024cvpr-camixersr,
  title     = {{CAMixerSR: Only Details Need More "Attention"}},
  author    = {Wang, Yan and Liu, Yi and Zhao, Shijie and Li, Junlin and Zhang, Li},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {25837--25846},
  url       = {https://mlanthology.org/cvpr/2024/wang2024cvpr-camixersr/}
}