PersonaHOI: Effortlessly Improving Face Personalization in Human-Object Interaction Generation

Abstract

We introduce PersonaHOI, a training- and tuning-free framework that fuses a general StableDiffusion (SD) model with a personalized face diffusion (PFD) model to generate identity-consistent human-object interaction (HOI) images. Although existing PFD models have advanced significantly, they often overemphasize facial features at the expense of full-body coherence. To address this, PersonaHOI introduces an additional SD branch guided by HOI-oriented text inputs. By incorporating cross-attention constraints in the PFD branch and spatial merging at both the latent and residual levels, PersonaHOI preserves personalized facial details while keeping non-facial regions consistent with the described interaction. Experiments, validated by a novel interaction alignment metric, demonstrate the superior realism and scalability of PersonaHOI, establishing a new standard for practical personalized face generation with HOI. Code is available at https://github.com/JoyHuYY1412/PersonaHOI.
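
The spatial merging mentioned in the abstract can be pictured as a mask-weighted blend of the two branches' latents. The snippet below is only a minimal sketch of that general idea, not the authors' implementation: the function name `merge_latents`, the `(C, H, W)` latent layout, and the use of a binary face mask are all assumptions made for illustration, and the abstract does not specify how the mask is obtained or how merging at the residual level is done.

```python
import numpy as np


def merge_latents(pfd_latent, sd_latent, face_mask):
    """Blend two latents of shape (C, H, W) using a face mask of shape (H, W).

    Hypothetical helper: where the mask is 1 the PFD branch is kept,
    where it is 0 the SD branch is kept.
    """
    # Clamp to [0, 1] and add a channel axis so the mask broadcasts over C.
    mask = np.clip(face_mask, 0.0, 1.0)[None, :, :]
    return mask * pfd_latent + (1.0 - mask) * sd_latent


# Toy inputs: constant latents make the origin of each merged value obvious.
pfd = np.ones((4, 8, 8))    # stand-in for the PFD branch latent
sd = np.zeros((4, 8, 8))    # stand-in for the SD branch latent
mask = np.zeros((8, 8))
mask[2:5, 3:6] = 1.0        # assumed face region (3 x 3 latent cells)

merged = merge_latents(pfd, sd, mask)
```

In this toy case the merged latent takes the PFD values inside the masked face region and the SD values everywhere else, which is the behaviour the abstract attributes to the facial and non-facial regions respectively.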

Cite

Text

Hu et al. "PersonaHOI: Effortlessly Improving Face Personalization in Human-Object Interaction Generation." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.02214

Markdown

[Hu et al. "PersonaHOI: Effortlessly Improving Face Personalization in Human-Object Interaction Generation." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/hu2025cvpr-personahoi/) doi:10.1109/CVPR52734.2025.02214

BibTeX

@inproceedings{hu2025cvpr-personahoi,
  title     = {{PersonaHOI: Effortlessly Improving Face Personalization in Human-Object Interaction Generation}},
  author    = {Hu, Xinting and Wang, Haoran and Lenssen, Jan Eric and Schiele, Bernt},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {23775--23784},
  doi       = {10.1109/CVPR52734.2025.02214},
  url       = {https://mlanthology.org/cvpr/2025/hu2025cvpr-personahoi/}
}