ComFusion: Enhancing Personalized Generation by Instance-Scene Compositing and Fusion
Abstract
Recent progress in personalizing text-to-image (T2I) diffusion models has demonstrated their capability to generate images based on personalized visual concepts using only a few user-provided examples. However, these models often struggle with maintaining high visual fidelity, particularly when modifying scenes according to textual descriptions. To address this challenge, we introduce ComFusion, an innovative approach that leverages pretrained models to create compositions of user-supplied subject images and predefined text scenes. ComFusion incorporates a class-scene prior preservation regularization, which combines subject class and scene-specific knowledge from pretrained models to enhance generation fidelity. Additionally, ComFusion uses coarse-generated images to ensure alignment with both the instance images and scene texts, thereby achieving a delicate balance between capturing the subject’s essence and maintaining scene fidelity. Extensive evaluations of ComFusion against various baselines in T2I personalization have demonstrated its qualitative and quantitative superiority.
Cite
Text
Hong et al. "ComFusion: Enhancing Personalized Generation by Instance-Scene Compositing and Fusion." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72784-9_1
Markdown
[Hong et al. "ComFusion: Enhancing Personalized Generation by Instance-Scene Compositing and Fusion." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/hong2024eccv-comfusion/) doi:10.1007/978-3-031-72784-9_1
BibTeX
@inproceedings{hong2024eccv-comfusion,
title = {{ComFusion: Enhancing Personalized Generation by Instance-Scene Compositing and Fusion}},
author = {Hong, Yan and Duan, Yuxuan and Zhang, Bo and Chen, Haoxing and Lan, Jun and Zhu, Huijia and Wang, Weiqiang and Zhang, Jianfu},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2024},
doi = {10.1007/978-3-031-72784-9_1},
url = {https://mlanthology.org/eccv/2024/hong2024eccv-comfusion/}
}