3D²-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling

Tang, Zichen; Yang, Hongyu; Zhang, Hanchen; Chen, Jiaxin; Huang, Di

doi:10.1609/AAAI.V39I7.32789

AAAI 2025 pp. 7338-7346

Abstract

Advancements in neural implicit representations and differentiable rendering have markedly improved the ability to learn animatable 3D avatars from sparse multi-view RGB videos. However, current methods that map observation space to canonical space often face challenges in capturing pose-dependent details and generalizing to novel poses. While diffusion models have demonstrated remarkable zero-shot capabilities in 2D image generation, their potential for creating animatable 3D avatars from 2D inputs remains underexplored. In this work, we introduce 3D²-Actor, a novel approach featuring a pose-conditioned 3D-aware human modeling pipeline that integrates iterative 2D denoising and 3D rectifying steps. The 2D denoiser, guided by pose cues, generates detailed multi-view images that provide the rich feature set necessary for high-fidelity 3D reconstruction and pose rendering. Complementing this, our Gaussian-based 3D rectifier renders images with enhanced 3D consistency through a two-stage projection strategy and a novel local coordinate representation. Additionally, we propose an innovative sampling strategy to ensure smooth temporal continuity across frames in video synthesis. Our method effectively addresses the limitations of traditional numerical solutions in handling ill-posed mappings, producing realistic and animatable 3D human avatars. Experimental results demonstrate that 3D²-Actor excels in high-fidelity avatar modeling and robustly generalizes to novel poses.

Cite

Text

Tang et al. "3D²-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I7.32789

Markdown

[Tang et al. "3D²-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/tang2025aaai-d/) doi:10.1609/AAAI.V39I7.32789

BibTeX

@inproceedings{tang2025aaai-d,
  title     = {{3D²-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling}},
  author    = {Tang, Zichen and Yang, Hongyu and Zhang, Hanchen and Chen, Jiaxin and Huang, Di},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {7338--7346},
  doi       = {10.1609/AAAI.V39I7.32789},
  url       = {https://mlanthology.org/aaai/2025/tang2025aaai-d/}
}