MasterWeaver: Taming Editability and Face Identity for Personalized Text-to-Image Generation

Wei, Yuxiang; Ji, Zhilong; Bai, Jinfeng; Zhang, Hongzhi; Zhang, Lei; Zuo, Wangmeng

doi:10.1007/978-3-031-72983-6_15

ECCV 2024

Abstract

Text-to-image (T2I) diffusion models have shown significant success in personalized text-to-image generation, which aims to generate novel images with human identities indicated by the reference images. Although several tuning-free methods achieve promising identity fidelity, they often suffer from overfitting: the learned identity tends to entangle with irrelevant information, resulting in unsatisfactory text controllability, especially on faces. In this work, we present MasterWeaver, a test-time tuning-free method designed to generate personalized images with both high identity fidelity and flexible editability. Specifically, MasterWeaver adopts an encoder to extract identity features and steers the image generation through additionally introduced cross-attention. To improve editability while maintaining identity fidelity, we propose an editing direction loss for training, which aligns the editing directions of MasterWeaver with those of the original T2I model. Additionally, we construct a face-augmented dataset to facilitate disentangled identity learning and further improve editability. Extensive experiments demonstrate that MasterWeaver not only generates personalized images with faithful identity but also exhibits superior text controllability. Our code can be found at https://github.com/csyxwei/MasterWeaver.
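
The abstract does not give the formula for the editing direction loss, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes an "editing direction" is the difference between a model's features for an edited prompt and for a base prompt, and that alignment is measured as one minus cosine similarity. The function names and the use of plain Python lists as feature vectors are hypothetical choices made for illustration.

```python
import math


def editing_direction(base_feat, edited_feat):
    """Feature-space shift caused by editing the prompt (assumed definition)."""
    return [e - b for b, e in zip(base_feat, edited_feat)]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def editing_direction_loss(pers_base, pers_edited, orig_base, orig_edited):
    """Hypothetical loss: 1 - cos(direction of personalized model,
    direction of original T2I model). Zero when the directions agree."""
    d_pers = editing_direction(pers_base, pers_edited)
    d_orig = editing_direction(orig_base, orig_edited)
    return 1.0 - cosine_similarity(d_pers, d_orig)
```

Under these assumptions the loss is 0 when the personalized model reacts to a prompt edit in the same direction as the original model, 1 when the two directions are orthogonal, and 2 when they are opposite. Only the direction is compared, so the magnitude of the edit is left unconstrained.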

Cite

Text

Wei et al. "MasterWeaver: Taming Editability and Face Identity for Personalized Text-to-Image Generation." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72983-6_15

Markdown

[Wei et al. "MasterWeaver: Taming Editability and Face Identity for Personalized Text-to-Image Generation." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/wei2024eccv-masterweaver/) doi:10.1007/978-3-031-72983-6_15

BibTeX

@inproceedings{wei2024eccv-masterweaver,
  title     = {{MasterWeaver: Taming Editability and Face Identity for Personalized Text-to-Image Generation}},
  author    = {Wei, Yuxiang and Ji, Zhilong and Bai, Jinfeng and Zhang, Hongzhi and Zhang, Lei and Zuo, Wangmeng},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72983-6_15},
  url       = {https://mlanthology.org/eccv/2024/wei2024eccv-masterweaver/}
}