Reference-Based Super-Resolution via Image-Based Retrieval-Augmented Generation Diffusion
Abstract
Most existing diffusion models have primarily utilized reference images for image-to-image translation rather than for super-resolution (SR). In SR-specific tasks, diffusion methods rely solely on low-resolution (LR) inputs, limiting their ability to leverage reference information. Prior reference-based diffusion SR methods have shown that incorporating appropriate references can significantly enhance reconstruction quality; however, identifying suitable references in real-world scenarios remains a critical challenge. Recently, Retrieval-Augmented Generation (RAG) has emerged as an effective framework that integrates retrieval-based and generation-based information from databases to enhance the accuracy and relevance of responses. Inspired by RAG, we propose an image-based RAG framework (iRAG) for realistic super-resolution, which employs a trainable hashing function to retrieve either real-world or generated references given an LR query. Retrieved patches are passed to a restoration module that generates high-fidelity super-resolved features, and a hallucination filtering mechanism is used to refine generated references from pre-trained diffusion models. Experimental results demonstrate that our approach not only resolves practical difficulties in reference selection but also delivers superior performance over existing diffusion and non-diffusion RefSR methods. Code is available at https://github.com/ByeonghunLee12/iRAG.
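The retrieval step described above (hashing an LR query and looking up a matching reference) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper's hashing function is trainable, whereas the sketch below substitutes fixed random projections as a stand-in, and every name here (`make_projection`, `hash_code`, `retrieve`, the toy feature vectors) is a hypothetical placeholder chosen for illustration.

```python
import random

def make_projection(dim, bits, seed=0):
    # Stand-in for a learned hashing layer: fixed random Gaussian weights.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]

def hash_code(vec, weights):
    # One bit per projection row: 1 if the dot product is non-negative.
    code = 0
    for row in weights:
        score = sum(w * x for w, x in zip(row, vec))
        code = (code << 1) | (1 if score >= 0 else 0)
    return code

def hamming(a, b):
    # Number of differing bits between two integer hash codes.
    return bin(a ^ b).count("1")

def retrieve(query_vec, database, weights, k=1):
    # database is a list of (name, code) pairs; rank by Hamming distance.
    query = hash_code(query_vec, weights)
    ranked = sorted(database, key=lambda item: hamming(query, item[1]))
    return [name for name, _ in ranked[:k]]

# Toy usage: three hypothetical reference patches described by 4-D features.
weights = make_projection(dim=4, bits=16)
features = {
    "ref_a": [0.9, 0.1, -0.3, 0.5],
    "ref_b": [-0.7, 0.8, 0.2, -0.4],
    "ref_c": [0.1, -0.9, 0.6, 0.3],
}
database = [(name, hash_code(vec, weights)) for name, vec in features.items()]
best = retrieve(features["ref_a"], database, weights, k=1)
```

Comparing compact binary codes by Hamming distance keeps lookup cheap even for large reference databases, which is the usual motivation for hashing-based retrieval. In a learned variant the projection weights would be optimized so that an LR query and its suitable references receive nearby codes.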
Cite
Text
Lee et al. "Reference-Based Super-Resolution via Image-Based Retrieval-Augmented Generation Diffusion." International Conference on Computer Vision, 2025.
Markdown
[Lee et al. "Reference-Based Super-Resolution via Image-Based Retrieval-Augmented Generation Diffusion." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/lee2025iccv-referencebased/)
BibTeX
@inproceedings{lee2025iccv-referencebased,
title = {{Reference-Based Super-Resolution via Image-Based Retrieval-Augmented Generation Diffusion}},
author = {Lee, Byeonghun and Cho, Hyunmin and Choi, Hong Gyu and Kang, Soo Min and Ahn, Iljun and Jin, Kyong Hwan},
booktitle = {International Conference on Computer Vision},
year = {2025},
pages = {10764--10774},
url = {https://mlanthology.org/iccv/2025/lee2025iccv-referencebased/}
}