High-Fidelity Human Avatars from a Single RGB Camera
Abstract
In this paper, we propose a coarse-to-fine framework to reconstruct a personalized high-fidelity human avatar from a monocular video. To deal with the misalignment caused by changing poses and shapes across frames, we design a dynamic surface network to recover pose-dependent surface deformations, which help to decouple the shape and texture of the person. To cope with the complexity of textures and generate photo-realistic results, we propose a reference-based neural rendering network and exploit a bottom-up sharpening-guided fine-tuning strategy to obtain detailed textures. Our framework also enables photo-realistic novel view/pose synthesis and shape editing applications. Experimental results on both a public dataset and our own collected dataset demonstrate that our method outperforms state-of-the-art methods. The code and dataset will be available at http://cic.tju.edu.cn/faculty/likun/projects/HF-Avatar.
Cite
Text
Zhao et al. "High-Fidelity Human Avatars from a Single RGB Camera." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.01544
Markdown
[Zhao et al. "High-Fidelity Human Avatars from a Single RGB Camera." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/zhao2022cvpr-highfidelity/) doi:10.1109/CVPR52688.2022.01544
BibTeX
@inproceedings{zhao2022cvpr-highfidelity,
title = {{High-Fidelity Human Avatars from a Single RGB Camera}},
author = {Zhao, Hao and Zhang, Jinsong and Lai, Yu-Kun and Zheng, Zerong and Xie, Yingdi and Liu, Yebin and Li, Kun},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2022},
pages = {15904-15913},
doi = {10.1109/CVPR52688.2022.01544},
url = {https://mlanthology.org/cvpr/2022/zhao2022cvpr-highfidelity/}
}