CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior
Abstract
Speech-driven 3D facial animation has been widely studied, yet there is still a gap to achieving realism and vividness due to the highly ill-posed nature and scarcity of audio-visual data. Existing works typically formulate the cross-modal mapping into a regression task, which suffers from the regression-to-mean problem leading to over-smoothed facial motions. In this paper, we propose to cast speech-driven facial animation as a code query task in a finite proxy space of the learned codebook, which effectively promotes the vividness of the generated motions by reducing the cross-modal mapping uncertainty. The codebook is learned by self-reconstruction over real facial motions and thus embedded with realistic facial motion priors. Over the discrete motion space, a temporal autoregressive model is employed to sequentially synthesize facial motions from the input speech signal, which guarantees lip-sync as well as plausible facial expressions. We demonstrate that our approach outperforms current state-of-the-art methods both qualitatively and quantitatively. Also, a user study further justifies our superiority in perceptual quality.
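The "code query" idea in the abstract can be illustrated with a minimal sketch of nearest-neighbor lookup in a finite codebook. This is an illustration under stated assumptions, not the authors' implementation: the codebook here is a hand-written toy rather than one learned by self-reconstruction, the feature vectors are made up, and the function names `quantize` and `lookup` are hypothetical. The speech-conditioned autoregressive model that would predict the codes is not shown.

```python
def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codebook entry."""
    indices = []
    for f in features:
        # Squared Euclidean distance from this feature to every code.
        dists = [sum((a - b) ** 2 for a, b in zip(f, code)) for code in codebook]
        indices.append(dists.index(min(dists)))
    return indices


def lookup(indices, codebook):
    """Replace each index with its codebook entry (the discrete proxy)."""
    return [codebook[i] for i in indices]


# Toy 2-D codebook with four entries (assumption: a real codebook is learned).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Toy continuous motion features, one per frame (made-up values).
features = [[0.1, -0.2], [0.9, 0.8], [0.2, 0.7]]

indices = quantize(features, codebook)
quantized = lookup(indices, codebook)
```

Because every output is drawn from the finite set of codebook entries, a predictor only has to choose among discrete candidates instead of regressing continuous values, which is the property the abstract credits with reducing over-smoothed motion.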
Cite
Text
Xing et al. "CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.01229
Markdown
[Xing et al. "CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/xing2023cvpr-codetalker/) doi:10.1109/CVPR52729.2023.01229
BibTeX
@inproceedings{xing2023cvpr-codetalker,
title = {{CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior}},
author = {Xing, Jinbo and Xia, Menghan and Zhang, Yuechen and Cun, Xiaodong and Wang, Jue and Wong, Tien-Tsin},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2023},
pages = {12780-12790},
doi = {10.1109/CVPR52729.2023.01229},
url = {https://mlanthology.org/cvpr/2023/xing2023cvpr-codetalker/}
}