Linear Attention Modeling for Learned Image Compression

Abstract

In recent years, learned image compression has made tremendous progress, achieving impressive coding efficiency. Its coding gain mainly comes from non-linear neural-network-based transforms and learnable entropy modeling. However, most studies focus on strong backbones, and few consider low-complexity designs. In this paper, we propose LALIC, a linear attention model for learned image compression. Specifically, we adopt Bi-RWKV blocks, using the Spatial Mix and Channel Mix modules to achieve more compact feature extraction, and apply a convolution-based Omni-Shift module to adapt to two-dimensional latent representations. Furthermore, we propose an RWKV-based Spatial-Channel ConTeXt model (RWKV-SCCTX) that leverages Bi-RWKV to model the correlation between neighboring features effectively. To our knowledge, this is the first work to utilize efficient Bi-RWKV models with linear attention for learned image compression. Experimental results demonstrate that our method achieves competitive RD performance, outperforming VTM-9.1 by -15.26%, -15.41%, and -17.63% in BD-rate on the Kodak, CLIC, and Tecnick datasets. The code is available at https://github.com/sjtu-medialab/RwkvCompress .
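The efficiency claim rests on linear attention: instead of forming the full N×N attention matrix, a kernel feature map lets the key-value summary be computed once and reused, reducing cost from O(N²d) to O(Nd²). The sketch below illustrates this general principle with the common elu(x)+1 feature map; it is an illustration of linear attention only, not the paper's Bi-RWKV implementation (the function name and shapes are assumptions for the example).

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """Linear attention in O(N d^2): kernelize with phi(x) = elu(x) + 1,
    then reassociate as phi(Q) @ (phi(K)^T V) instead of (Q K^T) V."""
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x)+1, a positive feature map
    Qf, Kf = phi(Q), phi(K)
    kv = Kf.T @ V                 # (d, d_v): key-value summary, independent of query count
    z = Kf.sum(axis=0)            # (d,): per-feature normalizer terms
    return (Qf @ kv) / (Qf @ z)[:, None].clip(eps)
```

Because `kv` and `z` are computed once, each query costs O(d·d_v) regardless of sequence length, which is what makes attention of this form attractive for low-complexity compression transforms. The result matches the explicit quadratic form `normalize(phi(Q) @ phi(K).T) @ V` up to numerical precision.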

Cite

Text

Feng et al. "Linear Attention Modeling for Learned Image Compression." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.00714

Markdown

[Feng et al. "Linear Attention Modeling for Learned Image Compression." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/feng2025cvpr-linear/) doi:10.1109/CVPR52734.2025.00714

BibTeX

@inproceedings{feng2025cvpr-linear,
  title     = {{Linear Attention Modeling for Learned Image Compression}},
  author    = {Feng, Donghui and Cheng, Zhengxue and Wang, Shen and Wu, Ronghua and Hu, Hongwei and Lu, Guo and Song, Li},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {7623--7632},
  doi       = {10.1109/CVPR52734.2025.00714},
  url       = {https://mlanthology.org/cvpr/2025/feng2025cvpr-linear/}
}