Reference-Based Super-Resolution via Image-Based Retrieval-Augmented Generation Diffusion

Abstract

Existing diffusion models have primarily used reference images for image-to-image translation rather than for super-resolution (SR). In SR-specific tasks, most diffusion methods rely solely on low-resolution (LR) inputs, limiting their ability to leverage reference information. Prior reference-based diffusion SR methods have shown that incorporating appropriate references can significantly enhance reconstruction quality; however, identifying suitable references in real-world scenarios remains a critical challenge. Recently, Retrieval-Augmented Generation (RAG) has emerged as an effective framework that combines information retrieved from databases with generation to improve the accuracy and relevance of responses. Inspired by RAG, we propose an image-based RAG framework (iRAG) for realistic super-resolution, which employs a trainable hashing function to retrieve either real-world or generated references given an LR query. Retrieved patches are passed to a restoration module that generates high-fidelity super-resolved features, and a hallucination filtering mechanism is used to refine generated references from pre-trained diffusion models. Experimental results demonstrate that our approach not only resolves practical difficulties in reference selection but also delivers superior performance over existing diffusion and non-diffusion RefSR methods. Code is available at https://github.com/ByeonghunLee12/iRAG.
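
To make the retrieval step concrete, the following is a minimal sketch of hash-based reference lookup, written under stated assumptions and not taken from the authors' code. A fixed random sign projection stands in for the trainable hashing function, the "features" are random placeholders rather than real patch embeddings, and the names `hash_codes` and `retrieve` are hypothetical. It only illustrates the general idea of mapping an LR query to a binary code and ranking stored reference codes by Hamming distance.

```python
import numpy as np


def hash_codes(features, projection):
    """Binarize features by the sign of a linear projection.

    Assumption: this random-projection hash is a stand-in for the
    trainable hashing function mentioned in the abstract.
    """
    return (features @ projection > 0).astype(np.uint8)


def retrieve(query_code, db_codes, k=1):
    """Return indices of the k database codes nearest in Hamming distance."""
    distances = np.count_nonzero(db_codes != query_code, axis=1)
    # A stable sort keeps the lowest index first when distances tie.
    return np.argsort(distances, kind="stable")[:k]


rng = np.random.default_rng(0)
dim, bits = 64, 32                              # hypothetical sizes
projection = rng.standard_normal((dim, bits))   # would be learned in practice
db_features = rng.standard_normal((100, dim))   # placeholder reference features
db_codes = hash_codes(db_features, projection)

# Query with the features of stored entry 7 and look up its nearest codes.
query_code = hash_codes(db_features[7][None, :], projection)[0]
top = retrieve(query_code, db_codes, k=3)
print(top)
```

Binary codes keep the lookup cheap because comparing a query against the database reduces to counting differing bits. How the actual hashing function is trained, and how generated references are filtered for hallucinations, is described in the paper and not modeled here.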

Cite

Text

Lee et al. "Reference-Based Super-Resolution via Image-Based Retrieval-Augmented Generation Diffusion." International Conference on Computer Vision, 2025.

Markdown

[Lee et al. "Reference-Based Super-Resolution via Image-Based Retrieval-Augmented Generation Diffusion." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/lee2025iccv-referencebased/)

BibTeX

@inproceedings{lee2025iccv-referencebased,
  title     = {{Reference-Based Super-Resolution via Image-Based Retrieval-Augmented Generation Diffusion}},
  author    = {Lee, Byeonghun and Cho, Hyunmin and Choi, Hong Gyu and Kang, Soo Min and Ahn, Iljun and Jin, Kyong Hwan},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {10764--10774},
  url       = {https://mlanthology.org/iccv/2025/lee2025iccv-referencebased/}
}