PQ-VAE: Learning Hierarchical Discrete Representations with Progressive Quantization

Abstract

Variational auto-encoders (VAEs) are widely used in generative modeling and representation learning, with applications ranging from image generation to data compression. However, conventional VAEs struggle to balance the compactness of the learned latent codes against their informativeness. In this work, we propose the Progressive Quantization VAE (PQ-VAE), which learns a progressive sequential structure for data representation that maximizes the mutual information between the latent representations and the original data within a limited description length. The resulting representations provide a global, compact, and hierarchical view of the data semantics, making them suitable for high-level tasks while achieving high compression rates. The proposed model thus offers an effective solution for generative modeling and data compression, enabling improved performance on high-level tasks such as image understanding and generation.
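The abstract describes progressive quantization only at a high level. As a hedged illustration of the general idea (each discrete code refines the residual left by the codes before it, so earlier codes carry coarse global semantics and later codes add detail), the minimal PyTorch sketch below implements a residual-style sequential quantizer. It is not the authors' implementation, and all names and hyperparameters (ProgressiveQuantizer, num_stages, codebook_size) are assumptions for illustration.

# Minimal sketch of progressive (sequential) quantization, assuming a
# residual-style scheme; NOT the paper's exact algorithm. Stage t quantizes
# the residual left by stages < t, so the code sequence is progressive:
# truncating it yields a coarser but still global description of the input.
import torch
import torch.nn as nn

class ProgressiveQuantizer(nn.Module):
    def __init__(self, dim=64, codebook_size=256, num_stages=4):
        super().__init__()
        # One codebook per stage (hypothetical sizes, chosen for illustration).
        self.codebooks = nn.ModuleList(
            [nn.Embedding(codebook_size, dim) for _ in range(num_stages)]
        )

    def forward(self, z):
        # z: (batch, dim) continuous latent from the encoder.
        residual = z
        quantized = torch.zeros_like(z)
        codes = []
        for codebook in self.codebooks:
            # Nearest codeword to the current residual (Euclidean distance).
            dists = torch.cdist(residual, codebook.weight)  # (batch, codebook_size)
            idx = dists.argmin(dim=1)                       # (batch,)
            q = codebook(idx)
            quantized = quantized + q
            residual = residual - q
            codes.append(idx)
        # Straight-through estimator so gradients reach the encoder.
        quantized = z + (quantized - z).detach()
        return quantized, torch.stack(codes, dim=1)  # codes: (batch, num_stages)

if __name__ == "__main__":
    pq = ProgressiveQuantizer()
    z = torch.randn(8, 64)
    z_q, codes = pq(z)
    print(z_q.shape, codes.shape)  # torch.Size([8, 64]) torch.Size([8, 4])

Under this reading, a shorter code prefix corresponds to a shorter description length, which is one way to realize the compactness-informativeness tradeoff the abstract refers to.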

Cite

Text

Huang et al. "PQ-VAE: Learning Hierarchical Discrete Representations with Progressive Quantization." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024. doi:10.1109/CVPRW63382.2024.00750

Markdown

[Huang et al. "PQ-VAE: Learning Hierarchical Discrete Representations with Progressive Quantization." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024.](https://mlanthology.org/cvprw/2024/huang2024cvprw-pqvae/) doi:10.1109/CVPRW63382.2024.00750

BibTeX

@inproceedings{huang2024cvprw-pqvae,
  title     = {{PQ-VAE: Learning Hierarchical Discrete Representations with Progressive Quantization}},
  author    = {Huang, Lun and Qiu, Qiang and Sapiro, Guillermo},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2024},
  pages     = {7550--7558},
  doi       = {10.1109/CVPRW63382.2024.00750},
  url       = {https://mlanthology.org/cvprw/2024/huang2024cvprw-pqvae/}
}