Variational Image Compression with a Scale Hyperprior

Abstract

We describe an end-to-end trainable model for image compression based on variational autoencoders. The model incorporates a hyperprior to effectively capture spatial dependencies in the latent representation. This hyperprior relates to side information, a concept universal to virtually all modern image codecs, but largely unexplored in image compression using artificial neural networks (ANNs). Unlike existing autoencoder compression methods, our model trains a complex prior jointly with the underlying autoencoder. We demonstrate that this model leads to state-of-the-art image compression when measuring visual quality using the popular MS-SSIM index, and yields rate–distortion performance surpassing published ANN-based methods when evaluated using a more traditional metric based on squared error (PSNR). Furthermore, we provide a qualitative comparison of models trained for different distortion metrics.
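To make the architecture described above concrete, the following is a minimal PyTorch sketch, not the authors' implementation. The transform names g_a, g_s, h_a, h_s follow the paper's notation; everything else is an illustrative assumption: the layer widths, the ReLU/Softplus nonlinearities (the paper uses GDN), and the unit-Gaussian stand-in for the paper's learned non-parametric prior on the hyper-latents. The hyperprior predicts a scale for each latent element, and the training loss is the rate–distortion objective R + λD.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleHyperpriorSketch(nn.Module):
    """Illustrative sketch only: widths, nonlinearities, and the
    hyper-latent prior are simplifications of the paper's model."""

    def __init__(self, n=128):
        super().__init__()
        # Analysis transform g_a: image x -> latents y.
        self.g_a = nn.Sequential(
            nn.Conv2d(3, n, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(n, n, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(n, n, 5, stride=2, padding=2),
        )
        # Synthesis transform g_s: quantized latents -> reconstruction.
        self.g_s = nn.Sequential(
            nn.ConvTranspose2d(n, n, 5, stride=2, padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(n, n, 5, stride=2, padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(n, 3, 5, stride=2, padding=2, output_padding=1),
        )
        # Hyper-analysis h_a: |y| -> hyper-latents z (the side information).
        self.h_a = nn.Sequential(
            nn.Conv2d(n, n, 3, stride=1, padding=1), nn.ReLU(),
            nn.Conv2d(n, n, 5, stride=2, padding=2),
        )
        # Hyper-synthesis h_s: quantized z -> per-element scales sigma for y.
        self.h_s = nn.Sequential(
            nn.ConvTranspose2d(n, n, 5, stride=2, padding=2, output_padding=1), nn.ReLU(),
            nn.Conv2d(n, n, 3, stride=1, padding=1), nn.Softplus(),
        )

    @staticmethod
    def quantize(t):
        # Additive uniform noise as a differentiable proxy for rounding
        # during training; hard rounding replaces this at test time.
        return t + torch.empty_like(t).uniform_(-0.5, 0.5)

    @staticmethod
    def bits(v, scale):
        # Rate of v under a zero-mean Gaussian of the given scale,
        # convolved with U(-1/2, 1/2) to account for quantization.
        d = torch.distributions.Normal(0.0, scale)
        p = (d.cdf(v + 0.5) - d.cdf(v - 0.5)).clamp_min(1e-9)
        return -torch.log2(p).sum()

    def forward(self, x):
        y = self.g_a(x)
        z = self.h_a(torch.abs(y))
        z_hat = self.quantize(z)
        sigma = self.h_s(z_hat) + 1e-6       # predicted scales for y
        y_hat = self.quantize(y)
        x_hat = self.g_s(y_hat)
        # Total rate in bits per pixel: latents modeled via the hyperprior's
        # scales; hyper-latents via a stand-in unit-Gaussian prior (the paper
        # learns a non-parametric factorized prior instead).
        total_bits = self.bits(y_hat, sigma) + self.bits(z_hat, torch.ones_like(z_hat))
        bpp = total_bits / (x.shape[0] * x.shape[2] * x.shape[3])
        return x_hat, bpp

model = ScaleHyperpriorSketch()
x = torch.rand(1, 3, 256, 256)               # dummy image batch in [0, 1]
x_hat, bpp = model(x)
loss = bpp + 0.01 * F.mse_loss(x_hat, x)     # rate + lambda * distortion
```

Because the hyper-synthesis is trained jointly with the autoencoder, the scales it predicts let the entropy model spend fewer bits on smooth regions and more on structured ones, which is the mechanism behind the rate–distortion gains the abstract reports.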

Cite

Text

Ballé et al. "Variational Image Compression with a Scale Hyperprior." International Conference on Learning Representations, 2018.

Markdown

[Ballé et al. "Variational Image Compression with a Scale Hyperprior." International Conference on Learning Representations, 2018.](https://mlanthology.org/iclr/2018/balle2018iclr-variational/)

BibTeX

@inproceedings{balle2018iclr-variational,
  title     = {{Variational Image Compression with a Scale Hyperprior}},
  author    = {Ballé, Johannes and Minnen, David and Singh, Saurabh and Hwang, Sung Jin and Johnston, Nick},
  booktitle = {International Conference on Learning Representations},
  year      = {2018},
  url       = {https://mlanthology.org/iclr/2018/balle2018iclr-variational/}
}