ToonTalker: Cross-Domain Face Reenactment

Abstract

This paper targets cross-domain face reenactment, i.e., driving a cartoon image with a video of a real person and vice versa. Many recent works focus on one-shot talking face generation, which drives a portrait with a real video, i.e., within-domain reenactment. Applying those methods directly to cross-domain animation causes inaccurate expression transfer, blurring, and even visible artifacts because of the domain shift between cartoon and real faces. Only a few works address cross-domain face reenactment. The most closely related work, AnimeCeleb, requires constructing a dataset of paired pose vectors and cartoon images by animating 3D characters, which makes it inapplicable when no paired data is available. In this paper, we propose a novel method for cross-domain reenactment without paired data. Specifically, we propose a transformer-based framework that aligns the motions from different domains in a common latent space, where motion transfer is conducted via latent code addition. Two domain-specific motion encoders and two learnable motion base memories capture domain properties. A source query transformer and a driving query transformer project domain-specific motion to the canonical space. The edited motion is then projected back to the source domain with a transformer. Moreover, since no paired data is provided, we propose a novel cross-domain training scheme that uses data from both domains with a designed analogy constraint. We also contribute a cartoon dataset in Disney style. Extensive evaluations demonstrate the superiority of our method over competing methods.
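
The data flow described in the abstract (domain-specific motion, projection to a canonical space, latent code addition, projection back to the source domain) can be illustrated with a toy sketch. This is not the authors' implementation: the names `attend`, `to_canonical`, and `transfer_motion`, the vector sizes, and the use of a single dot-product attention step as a stand-in for each transformer are all assumptions made for illustration, and the random memories stand in for parameters that would be learned.

```python
import math
import random

random.seed(0)
DIM, NUM_BASES = 4, 6  # hypothetical sizes, chosen only for this sketch


def random_matrix(rows, cols):
    """Random values standing in for learned parameters or encoder outputs."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def attend(query, memory):
    """One dot-product attention step: a softmax-weighted mix of memory rows."""
    scale = math.sqrt(len(query))
    scores = [sum(q * m for q, m in zip(query, row)) / scale for row in memory]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [sum(w * row[d] for w, row in zip(weights, memory))
            for d in range(len(memory[0]))]


# Assumption: one motion base memory per domain, as the abstract mentions.
real_bases = random_matrix(NUM_BASES, DIM)
cartoon_bases = random_matrix(NUM_BASES, DIM)


def to_canonical(motion, bases):
    """Stand-in for a query transformer: the motion feature queries its memory."""
    return attend(motion, bases)


def transfer_motion(source_motion, driving_motion, source_bases, driving_bases):
    """Add the two canonical codes, then map the result back to the source domain."""
    z_source = to_canonical(source_motion, source_bases)
    z_driving = to_canonical(driving_motion, driving_bases)
    z_edited = [a + b for a, b in zip(z_source, z_driving)]  # latent code addition
    return attend(z_edited, source_bases)  # stand-in for the back-projection


# Usage: a cartoon source driven by a real-face motion feature.
cartoon_motion = random_matrix(1, DIM)[0]  # placeholder for a cartoon encoder output
real_motion = random_matrix(1, DIM)[0]     # placeholder for a real-face encoder output
edited = transfer_motion(cartoon_motion, real_motion, cartoon_bases, real_bases)
```

In this sketch the returned vector is a convex combination of the source-domain bases, which loosely mirrors the idea of expressing the edited motion in the source domain. How the actual model combines, trains, and decodes these codes is specified in the paper, not here.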

Cite

Text

Gong et al. "ToonTalker: Cross-Domain Face Reenactment." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00707

Markdown

[Gong et al. "ToonTalker: Cross-Domain Face Reenactment." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/gong2023iccv-toontalker/) doi:10.1109/ICCV51070.2023.00707

BibTeX

@inproceedings{gong2023iccv-toontalker,
  title     = {{ToonTalker: Cross-Domain Face Reenactment}},
  author    = {Gong, Yuan and Zhang, Yong and Cun, Xiaodong and Yin, Fei and Fan, Yanbo and Wang, Xuan and Wu, Baoyuan and Yang, Yujiu},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {7690-7700},
  doi       = {10.1109/ICCV51070.2023.00707},
  url       = {https://mlanthology.org/iccv/2023/gong2023iccv-toontalker/}
}