High-Fidelity Human Avatars from a Single RGB Camera

Abstract

In this paper, we propose a coarse-to-fine framework to reconstruct a personalized high-fidelity human avatar from a monocular video. To deal with the misalignment caused by changing poses and shapes across frames, we design a dynamic surface network to recover pose-dependent surface deformations, which help to decouple the shape and texture of the person. To cope with the complexity of textures and generate photo-realistic results, we propose a reference-based neural rendering network and exploit a bottom-up sharpening-guided fine-tuning strategy to obtain detailed textures. Our framework also enables photo-realistic novel view/pose synthesis and shape editing. Experimental results on both a public dataset and our own collected dataset demonstrate that our method outperforms state-of-the-art methods. The code and dataset will be available at http://cic.tju.edu.cn/faculty/likun/projects/HF-Avatar.
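The abstract describes the dynamic surface network only at a high level. Below is a minimal sketch of one plausible realization: an MLP that predicts a pose-conditioned 3D offset for each vertex of an SMPL-like body template, so that pose-dependent geometry can be separated from the static shape and texture. All names and defaults here (PoseDependentDeformation, the 6890-vertex / 72-D pose convention of SMPL) are illustrative assumptions, not the architecture used in the paper.

import torch
import torch.nn as nn

class PoseDependentDeformation(nn.Module):
    """Hypothetical sketch: map a body pose vector plus a learnable
    per-vertex embedding to a per-vertex xyz offset on the template."""
    def __init__(self, num_verts=6890, pose_dim=72, embed_dim=32, hidden=256):
        super().__init__()
        # one learnable embedding per template vertex (num_verts x embed_dim)
        self.vert_embed = nn.Embedding(num_verts, embed_dim)
        self.mlp = nn.Sequential(
            nn.Linear(pose_dim + embed_dim, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 3),  # xyz offset per vertex
        )

    def forward(self, pose):
        # pose: (B, pose_dim) axis-angle body pose, e.g. SMPL's 72-D vector
        B = pose.shape[0]
        V = self.vert_embed.num_embeddings
        emb = self.vert_embed.weight.unsqueeze(0).expand(B, V, -1)  # (B, V, E)
        p = pose.unsqueeze(1).expand(B, V, -1)                      # (B, V, P)
        return self.mlp(torch.cat([p, emb], dim=-1))                # (B, V, 3)

# usage: offsets would be added to the canonical template before skinning
net = PoseDependentDeformation()
pose = torch.zeros(1, 72)
offsets = net(pose)  # (1, 6890, 3)

In a pipeline of this kind, the predicted offsets deform the canonical template per frame, which is what lets the texture be estimated on a consistent, well-aligned surface.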

Cite

Text

Zhao et al. "High-Fidelity Human Avatars from a Single RGB Camera." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.01544

Markdown

[Zhao et al. "High-Fidelity Human Avatars from a Single RGB Camera." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/zhao2022cvpr-highfidelity/) doi:10.1109/CVPR52688.2022.01544

BibTeX

@inproceedings{zhao2022cvpr-highfidelity,
  title     = {{High-Fidelity Human Avatars from a Single RGB Camera}},
  author    = {Zhao, Hao and Zhang, Jinsong and Lai, Yu-Kun and Zheng, Zerong and Xie, Yingdi and Liu, Yebin and Li, Kun},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {15904--15913},
  doi       = {10.1109/CVPR52688.2022.01544},
  url       = {https://mlanthology.org/cvpr/2022/zhao2022cvpr-highfidelity/}
}