Animatable Gaussians: Learning Pose-Dependent Gaussian Maps for High-Fidelity Human Avatar Modeling

Abstract

Modeling animatable human avatars from RGB videos is a long-standing and challenging problem. Recent works usually adopt MLP-based neural radiance fields (NeRF) to represent 3D humans, but it remains difficult for pure MLPs to regress pose-dependent garment details. To this end, we introduce Animatable Gaussians, a new avatar representation that leverages powerful 2D CNNs and 3D Gaussian splatting to create high-fidelity avatars. To associate 3D Gaussians with the animatable avatar, we learn a parametric template from the input videos, and then parameterize the template on two front & back canonical Gaussian maps where each pixel represents a 3D Gaussian. The learned template is adaptive to the wearing garments for modeling looser clothes like dresses. Such template-guided 2D parameterization enables us to employ a powerful StyleGAN-based CNN to learn the pose-dependent Gaussian maps for modeling detailed dynamic appearances. Furthermore, we introduce a pose projection strategy for better generalization given novel poses. Overall, our method can create lifelike avatars with dynamic, realistic and generalized appearances. Experiments show that our method outperforms other state-of-the-art approaches. Code: https://github.com/lizhe00/AnimatableGaussians.
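
As a rough illustration of the "each pixel represents a 3D Gaussian" idea, the following minimal NumPy sketch flattens a front map and a back map into one array of per-Gaussian parameters. It is not taken from the authors' code: the function name `gaussians_from_maps`, the validity masks, and the 14-channel layout (position, scale, rotation, opacity, color) are all assumptions made for the example.

```python
import numpy as np


def gaussians_from_maps(front_map, back_map, front_mask, back_mask):
    """Collect per-pixel parameters from two H x W x C maps into an (N, C) array.

    Hypothetical helper: only pixels where the mask is True (assumed to be
    pixels covered by the template) contribute a Gaussian.
    """
    front = front_map[front_mask]  # (N_front, C), row-major pixel order
    back = back_map[back_mask]     # (N_back, C)
    return np.concatenate([front, back], axis=0)


# Assumed channel layout (illustrative only):
# 3 position + 3 scale + 4 rotation quaternion + 1 opacity + 3 color = 14
H, W, C = 4, 4, 14
rng = np.random.default_rng(0)
front_map = rng.normal(size=(H, W, C))
back_map = rng.normal(size=(H, W, C))

# Toy masks: a 2 x 2 patch on the front map, the top row on the back map.
front_mask = np.zeros((H, W), dtype=bool)
front_mask[1:3, 1:3] = True
back_mask = np.zeros((H, W), dtype=bool)
back_mask[0, :] = True

gaussians = gaussians_from_maps(front_map, back_map, front_mask, back_mask)
print(gaussians.shape)
```

Because the parameters live on a regular 2D grid, a 2D CNN can predict them like an image, and the flattening step above turns that image back into an unordered set of Gaussians for rendering.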

Cite

Text

Li et al. "Animatable Gaussians: Learning Pose-Dependent Gaussian Maps for High-Fidelity Human Avatar Modeling." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.01864

Markdown

[Li et al. "Animatable Gaussians: Learning Pose-Dependent Gaussian Maps for High-Fidelity Human Avatar Modeling." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/li2024cvpr-animatable/) doi:10.1109/CVPR52733.2024.01864

BibTeX

@inproceedings{li2024cvpr-animatable,
  title     = {{Animatable Gaussians: Learning Pose-Dependent Gaussian Maps for High-Fidelity Human Avatar Modeling}},
  author    = {Li, Zhe and Zheng, Zerong and Wang, Lizhen and Liu, Yebin},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {19711-19722},
  doi       = {10.1109/CVPR52733.2024.01864},
  url       = {https://mlanthology.org/cvpr/2024/li2024cvpr-animatable/}
}