Ultra High-Resolution Image Inpainting with Patch-Based Content Consistency Adapter
Abstract
In this work, we present Patch-Adapter, an effective framework for high-resolution text-guided image inpainting. Unlike existing methods limited to lower resolutions, our approach achieves 4K+ resolution while maintaining precise content consistency and prompt alignment, two critical challenges in image inpainting that intensify with increasing resolution and texture complexity. Patch-Adapter leverages a two-stage adapter architecture to scale the diffusion model's resolution from 1K to 4K+ without requiring structural overhauls: (1) Dual Context Adapter: learns coherence between masked and unmasked regions at reduced resolutions to establish global structural consistency. (2) Reference Patch Adapter: implements a patch-level attention mechanism for full-resolution inpainting, preserving local detail fidelity through adaptive feature fusion. This dual-stage architecture addresses the scalability gap in high-resolution inpainting by decoupling global semantics from localized refinement. Experiments demonstrate that Patch-Adapter not only resolves artifacts common in large-scale inpainting but also achieves state-of-the-art performance on the OpenImages and photo-concept-bucket datasets, outperforming existing methods in both perceptual quality and text-prompt adherence. The code is available at: https://github.com/Roveer/Patch-Based-Adapter
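The coarse-to-fine control flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two adapters are replaced by trivial stand-ins (a mean-colour fill for the low-resolution stage and a simple copy of the coarse result for the patch stage), and every function name, the `factor` and `patch` parameters, and the mask convention (1 marks the hole) are assumptions made for illustration. It only shows how a global pass at reduced resolution can serve as a reference for independent full-resolution patches.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool an (H, W, C) array; H and W must be divisible by factor."""
    h, w, c = img.shape
    return img.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

def upsample(img, factor):
    """Nearest-neighbour upsampling by an integer factor."""
    return img.repeat(factor, axis=0).repeat(factor, axis=1)

def coarse_inpaint(img, mask):
    """Hypothetical stand-in for the low-resolution stage:
    fill the hole with the mean colour of the known pixels."""
    known = mask < 0.5
    fill = img[known].mean(axis=0) if known.any() else np.zeros(img.shape[-1])
    out = img.copy()
    out[~known] = fill
    return out

def refine_patch(patch, mask_patch, reference_patch):
    """Hypothetical stand-in for the patch stage: keep known pixels,
    take the coarse reference inside the hole."""
    m = mask_patch[..., None]
    return patch * (1.0 - m) + reference_patch * m

def two_stage_inpaint(image, mask, factor=4, patch=256):
    """image: float (H, W, C); mask: float (H, W), 1 = hole, 0 = known."""
    # Stage 1: global pass at reduced resolution.
    low = downsample(image, factor)
    low_mask = downsample(mask[..., None], factor)[..., 0]
    coarse = upsample(coarse_inpaint(low, low_mask), factor)

    # Stage 2: full-resolution pass, one patch at a time,
    # each patch guided by the matching region of the coarse result.
    out = image.copy()
    h, w, _ = image.shape
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            sl = (slice(y, y + patch), slice(x, x + patch))
            out[sl] = refine_patch(image[sl], mask[sl], coarse[sl])
    return out
```

Calling `two_stage_inpaint(image, mask, factor=4, patch=256)` on a float image returns an array of the same shape in which pixels outside the mask are unchanged. In the actual method, both stand-ins would be diffusion-based adapters, and the patch stage would use attention over reference features rather than a direct copy.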
Cite
Text
Zhang et al. "Ultra High-Resolution Image Inpainting with Patch-Based Content Consistency Adapter." International Conference on Computer Vision, 2025.
Markdown
[Zhang et al. "Ultra High-Resolution Image Inpainting with Patch-Based Content Consistency Adapter." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/zhang2025iccv-ultra/)
BibTeX
@inproceedings{zhang2025iccv-ultra,
title = {{Ultra High-Resolution Image Inpainting with Patch-Based Content Consistency Adapter}},
author = {Zhang, Jianhui and Cheng, Shen and Sun, Qirui and Liu, Jia and Luyang, Wang and Feng, Chaoyu and Fang, Chen and Lei, Lei and Wang, Jue and Liu, Shuaicheng},
booktitle = {International Conference on Computer Vision},
year = {2025},
pages = {16991-17000},
url = {https://mlanthology.org/iccv/2025/zhang2025iccv-ultra/}
}