ReF-LDM: A Latent Diffusion Model for Reference-Based Face Image Restoration
Abstract
While recent works on blind face image restoration have successfully produced impressive high-quality (HQ) images with abundant details from low-quality (LQ) input images, the generated content may not accurately reflect the real appearance of a person. To address this problem, incorporating well-shot personal images as additional reference inputs may be a promising strategy. Inspired by the recent success of the Latent Diffusion Model (LDM) in image generation, we propose ReF-LDM, an adaptation of LDM designed to generate HQ face images conditioned on one LQ image and multiple HQ reference images. Our LDM-based model incorporates an effective and efficient mechanism, CacheKV, for conditioning on reference images. Additionally, we design a timestep-scaled identity loss, enabling LDM to focus on learning the discriminating features of human faces. Lastly, we construct FFHQ-ref, a dataset consisting of 20,406 HQ face images with corresponding reference images, which can serve as both training and evaluation data for reference-based face restoration models.
Cite
Text
Hsiao et al. "ReF-LDM: A Latent Diffusion Model for Reference-Based Face Image Restoration." Neural Information Processing Systems, 2024. doi:10.52202/079017-2380
Markdown
[Hsiao et al. "ReF-LDM: A Latent Diffusion Model for Reference-Based Face Image Restoration." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/hsiao2024neurips-refldm/) doi:10.52202/079017-2380
BibTeX
@inproceedings{hsiao2024neurips-refldm,
title = {{ReF-LDM: A Latent Diffusion Model for Reference-Based Face Image Restoration}},
author = {Hsiao, Chi-Wei and Liu, Yu-Lun and Yang, Cheng-Kun and Kuo, Sheng-Po and Jou, Yucheun Kevin and Chen, Chia-Ping},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-2380},
url = {https://mlanthology.org/neurips/2024/hsiao2024neurips-refldm/}
}