StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-Trained StyleGAN
Abstract
One-shot talking face generation aims at synthesizing a high-quality talking face video from an arbitrary portrait image, driven by a video or an audio segment. In this work, we provide a solution from a novel perspective that differs from existing frameworks. We first investigate the latent feature space of a pre-trained StyleGAN and discover some excellent spatial transformation properties. Based on this observation, we propose a novel unified framework based on a pre-trained StyleGAN that enables a set of powerful functionalities, i.e., high-resolution video generation, disentangled control by driving video or audio, and flexible face editing. Our framework elevates the resolution of the synthesized talking face to 1024×1024 for the first time, even though the training dataset has a lower resolution. Moreover, our framework allows two types of facial editing, i.e., global editing via GAN inversion and intuitive editing via 3D morphable models. Comprehensive experiments show superior video quality and flexible controllability over state-of-the-art methods. Code is available at https://github.com/FeiiYin/StyleHEAT.
Cite
Text
Yin et al. "StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-Trained StyleGAN." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-19790-1_6
Markdown
[Yin et al. "StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-Trained StyleGAN." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/yin2022eccv-styleheat/) doi:10.1007/978-3-031-19790-1_6
BibTeX
@inproceedings{yin2022eccv-styleheat,
title = {{StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-Trained StyleGAN}},
author = {Yin, Fei and Zhang, Yong and Cun, Xiaodong and Cao, Mingdeng and Fan, Yanbo and Wang, Xuan and Bai, Qingyan and Wu, Baoyuan and Wang, Jue and Yang, Yujiu},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2022},
doi = {10.1007/978-3-031-19790-1_6},
url = {https://mlanthology.org/eccv/2022/yin2022eccv-styleheat/}
}