Self-Supervised Variable Rate Image Compression Using Visual Attention

Abstract

The recent success of self-supervised learning stems from its ability to learn representations from self-defined pseudo-labels that transfer to several downstream tasks. Motivated by this ability, we present a deep image compression technique that learns the lossy reconstruction of raw images from the self-supervised representations of a SimCLR ResNet-50 architecture. Our framework uses a feature pyramid to achieve variable rate compression of the image, with a self-attention map guiding the optimal allocation of bits. The paper examines the effects of contrastive self-supervised representations and the self-attention map on the distortion and perceptual quality of the reconstructed image. Experiments on different classes of images show that the proposed method outperforms other variable rate deep compression models without compromising the perceptual quality of the images.
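
As a rough illustration of the pipeline the abstract describes, the sketch below pairs a frozen ResNet-50 feature pyramid (standing in for the SimCLR-pretrained encoder) with a learned single-channel attention map that scales the latent before quantization, i.e. attention-guided bit allocation, followed by a small decoder. All module names, channel sizes, and the quantization stand-in are assumptions for illustration; the abstract does not give the authors' implementation details.

# Minimal, illustrative sketch (assumed architecture, not the paper's code):
# frozen-style ResNet-50 pyramid -> fused latent -> attention-weighted
# quantization -> lightweight decoder for reconstruction.
import torch
import torch.nn as nn
import torchvision


class AttentionGuidedCompressor(nn.Module):
    def __init__(self, latent_channels=64):
        super().__init__()
        # Stand-in backbone; SimCLR-pretrained weights would be loaded here.
        backbone = torchvision.models.resnet50(weights=None)
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1,
                                  backbone.relu, backbone.maxpool)
        self.layer1, self.layer2 = backbone.layer1, backbone.layer2

        # Fuse two pyramid levels into one latent (layer2 output is
        # upsampled to layer1's spatial resolution before concatenation).
        self.fuse = nn.Conv2d(256 + 512, latent_channels, kernel_size=1)
        # Single-channel attention map that modulates how many bits the
        # quantized latent effectively spends at each spatial location.
        self.attn = nn.Sequential(nn.Conv2d(latent_channels, 1, 1), nn.Sigmoid())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(latent_channels, 64, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),
        )

    def forward(self, x):
        f1 = self.layer1(self.stem(x))      # features at 1/4 resolution
        f2 = self.layer2(f1)                # features at 1/8 resolution
        f2_up = nn.functional.interpolate(f2, size=f1.shape[-2:], mode="nearest")
        latent = self.fuse(torch.cat([f1, f2_up], dim=1))
        attn = self.attn(latent)
        # Attention-weighted latent; rounding stands in for quantization,
        # with a straight-through estimator so gradients still flow.
        y = latent * attn
        y_hat = y + (torch.round(y) - y).detach()
        return self.decoder(y_hat), attn


x = torch.randn(1, 3, 256, 256)
recon, attn_map = AttentionGuidedCompressor()(x)
print(recon.shape, attn_map.shape)  # reconstruction at input size, attention at 1/4 resolution

In such a setup, scaling or masking the attention-weighted latent at different strengths is one way a single model can cover multiple rate points, which is the variable-rate behaviour the abstract refers to.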

Cite

Text

Sinha et al. "Self-Supervised Variable Rate Image Compression Using Visual Attention." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022. doi:10.1109/CVPRW56347.2022.00179

Markdown

[Sinha et al. "Self-Supervised Variable Rate Image Compression Using Visual Attention." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022.](https://mlanthology.org/cvprw/2022/sinha2022cvprw-selfsupervised/) doi:10.1109/CVPRW56347.2022.00179

BibTeX

@inproceedings{sinha2022cvprw-selfsupervised,
  title     = {{Self-Supervised Variable Rate Image Compression Using Visual Attention}},
  author    = {Sinha, Abhishek Kumar and Moorthi, S. Manthira and Dhar, Debajyoti},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2022},
  pages     = {1720-1724},
  doi       = {10.1109/CVPRW56347.2022.00179},
  url       = {https://mlanthology.org/cvprw/2022/sinha2022cvprw-selfsupervised/}
}