Texture Vector-Quantization and Reconstruction Aware Prediction for Generative Super-Resolution

Abstract

Vector-quantization (VQ) based models have recently demonstrated strong potential for visual prior modeling. However, existing VQ-based methods simply encode visual features with their nearest codebook items and train the index predictor with code-level supervision. Due to the richness of visual signals, VQ encoding often leads to large quantization error. Furthermore, training the predictor with code-level supervision cannot take the final reconstruction error into consideration, resulting in sub-optimal prior modeling accuracy. In this paper, we address these two issues and propose a Texture Vector-Quantization strategy and a Reconstruction Aware Prediction strategy. The texture vector-quantization strategy leverages the characteristics of the super-resolution task and introduces a codebook only to model the prior of the missing textures. The reconstruction aware prediction strategy, in turn, uses the straight-through estimator to train the index predictor directly with image-level supervision. Our proposed generative SR model, TVQ&RAP, is able to deliver photo-realistic SR results at a small computational cost.
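The two mechanisms the abstract refers to, nearest-codebook encoding and straight-through training of an index predictor, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy codebook, the identity "decoder", and the squared-error loss are all assumptions chosen for illustration, and the backward pass shown is one common form of the straight-through estimator (hard selection in the forward pass, softmax-weighted mixture in the backward pass).

```python
import math


def nearest_code(feature, codebook):
    """Return (index, squared error) of the codebook entry closest to `feature`."""
    best_idx, best_err = 0, float("inf")
    for idx, code in enumerate(codebook):
        err = sum((f - c) ** 2 for f, c in zip(feature, code))
        if err < best_err:
            best_idx, best_err = idx, err
    return best_idx, best_err


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ste_logit_grad(logits, codebook, target):
    """Straight-through gradient of a reconstruction loss w.r.t. predictor logits.

    Forward: pick the hard (argmax) code and measure squared error against
    `target`. The decoder is assumed to be the identity here (illustration only).
    Backward: pretend the output was the softmax-weighted mixture of codes,
    so the reconstruction error can reach the logits.
    """
    probs = softmax(logits)
    hard_idx = max(range(len(probs)), key=probs.__getitem__)
    z = codebook[hard_idx]

    loss = sum((a - b) ** 2 for a, b in zip(z, target))
    grad_z = [2.0 * (a - b) for a, b in zip(z, target)]

    # d(loss)/d(p_k) under the soft surrogate z = sum_k p_k * code_k
    g = [sum(gz * c for gz, c in zip(grad_z, code)) for code in codebook]
    # Chain through the softmax Jacobian: p_j * (g_j - sum_k p_k * g_k)
    mean_g = sum(p * gk for p, gk in zip(probs, g))
    grad_logits = [p * (gk - mean_g) for p, gk in zip(probs, g)]
    return hard_idx, loss, grad_logits


# Toy 2-D codebook (hypothetical values)
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 0.0]]

idx, err = nearest_code([0.9, 1.2], codebook)
hard_idx, loss, grad = ste_logit_grad([1.0, 0.0, 0.0], codebook, target=[1.0, 1.0])
```

In this toy setting the predictor currently selects code 0, while code 1 reconstructs the target exactly. The gradient for logit 1 is negative, so a gradient-descent step raises it, which is the sense in which image-level supervision can steer index prediction. Code-level supervision would instead compare the predicted index against a fixed label, with no view of the reconstruction error.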

Cite

Text

Li et al. "Texture Vector-Quantization and Reconstruction Aware Prediction for Generative Super-Resolution." International Conference on Learning Representations, 2026.

Markdown

[Li et al. "Texture Vector-Quantization and Reconstruction Aware Prediction for Generative Super-Resolution." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/li2026iclr-texture/)

BibTeX

@inproceedings{li2026iclr-texture,
  title     = {{Texture Vector-Quantization and Reconstruction Aware Prediction for Generative Super-Resolution}},
  author    = {Li, Qifan and Zou, Jiale and Zhang, Jinhua and Long, Wei and Zhou, Xingyu and Gu, Shuhang},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/li2026iclr-texture/}
}