Let the Avatar Talk Using Texts Without Paired Training Data

Abstract

This paper introduces text-driven talking avatar generation, a task that uses text to instruct both the generation and animation of an avatar. One significant obstacle in this task is the absence of paired text and talking avatar data for model training, which limits data-driven methodologies. To address this, we present a zero-shot approach that adapts an existing 3D-aware image generation model, trained on a large-scale image dataset for high-quality avatar creation, so that it aligns with textual instructions and can be animated to produce talking avatars, eliminating the need for paired text and talking avatar data. The core of our approach is the seamless integration of a 3D-aware image generation model (i.e., EG3D), an explicit 3D morphable model (3DMM), and a newly developed self-supervised inpainting technique, which together create and animate the avatar and generate a temporally consistent talking video. Thorough evaluations demonstrate the effectiveness of our proposed approach in generating realistic avatars based on textual descriptions and empowering avatars to express user-specified text. Notably, our approach is highly controllable and can generate rich expressions and head poses.

Cite

Text

Wu et al. "Let the Avatar Talk Using Texts Without Paired Training Data." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73223-2_23

Markdown

[Wu et al. "Let the Avatar Talk Using Texts Without Paired Training Data." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/wu2024eccv-let/) doi:10.1007/978-3-031-73223-2_23

BibTeX

@inproceedings{wu2024eccv-let,
  title     = {{Let the Avatar Talk Using Texts Without Paired Training Data}},
  author    = {Wu, Xiuzhe and Sun, Yang-Tian and Chen, Handi and Zhou, Hang and Wang, Jingdong and Liu, Zhengzhe and Qi, Xiaojuan},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-73223-2_23},
  url       = {https://mlanthology.org/eccv/2024/wu2024eccv-let/}
}