ColorizeDiffusion: Improving Reference-Based Sketch Colorization with Latent Diffusion Model
Abstract
Diffusion models have achieved great success in dual-conditioned image generation. However, they still face significant challenges in image-guided sketch colorization, where reference and sketch images usually exhibit different spatial structures and semantics. This mismatch, termed "distribution shift" in this paper, results in various artifacts and degrades colorization quality. To address this issue, we conducted thorough investigations into the image-prompted latent diffusion model and, based on our analysis, developed a two-stage training framework to mitigate the effects of distribution shift. Comprehensive quantitative comparisons, qualitative evaluations, and user studies were performed to demonstrate the superiority of our proposed methods. Additionally, an ablation study was conducted to assess the impact of the distribution shift and the selection of reference embeddings. Code is publicly available at https://github.com/tellurionkanata/colorizeDiffusion.
Cite
Text
Yan et al. "ColorizeDiffusion: Improving Reference-Based Sketch Colorization with Latent Diffusion Model." Winter Conference on Applications of Computer Vision, 2025.
Markdown
[Yan et al. "ColorizeDiffusion: Improving Reference-Based Sketch Colorization with Latent Diffusion Model." Winter Conference on Applications of Computer Vision, 2025.](https://mlanthology.org/wacv/2025/yan2025wacv-colorizediffusion/)
BibTeX
@inproceedings{yan2025wacv-colorizediffusion,
title = {{ColorizeDiffusion: Improving Reference-Based Sketch Colorization with Latent Diffusion Model}},
author = {Yan, Dingkun and Yuan, Liang and Wu, Erwin and Nishioka, Yuma and Fujishiro, Issei and Saito, Suguru},
booktitle = {Winter Conference on Applications of Computer Vision},
year = {2025},
pages = {5092--5102},
url = {https://mlanthology.org/wacv/2025/yan2025wacv-colorizediffusion/}
}