PSHuman: Photorealistic Single-Image 3D Human Reconstruction Using Cross-Scale Multiview Diffusion and Explicit Remeshing

Abstract

Photorealistic 3D human modeling is essential for various applications and has seen tremendous progress. However, existing methods for monocular full-body reconstruction, which typically rely on the front view and/or a predicted back view, still struggle to achieve satisfactory results because the problem is ill-posed and self-occlusions are complex. In this paper, we propose PSHuman, a novel framework that explicitly reconstructs human meshes using priors from a multiview diffusion model. We find that directly applying multiview diffusion to single-view human images leads to severe geometric distortions, especially on generated faces. To address this, we propose a cross-scale diffusion that models the joint probability distribution of global full-body shape and local facial characteristics, enabling identity-preserving novel-view generation without geometric distortion. Moreover, to enhance cross-view body shape consistency across varied human poses, we condition the generative model on a parametric body model (SMPL-X), which provides body priors and prevents unnatural views inconsistent with human anatomy. Leveraging the generated multiview normal and color images, we present SMPL-X-initialized explicit human carving to recover realistic textured human meshes efficiently. Extensive experiments on CAPE and THuman2.1 demonstrate PSHuman's superiority in geometric detail, texture fidelity, and generalization capability.

Cite

Text

Li et al. "PSHuman: Photorealistic Single-Image 3D Human Reconstruction Using Cross-Scale Multiview Diffusion and Explicit Remeshing." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.01492

Markdown

[Li et al. "PSHuman: Photorealistic Single-Image 3D Human Reconstruction Using Cross-Scale Multiview Diffusion and Explicit Remeshing." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/li2025cvpr-pshuman/) doi:10.1109/CVPR52734.2025.01492

BibTeX

@inproceedings{li2025cvpr-pshuman,
  title     = {{PSHuman: Photorealistic Single-Image 3D Human Reconstruction Using Cross-Scale Multiview Diffusion and Explicit Remeshing}},
  author    = {Li, Peng and Zheng, Wangguandong and Liu, Yuan and Yu, Tao and Li, Yangguang and Qi, Xingqun and Chi, Xiaowei and Xia, Siyu and Cao, Yan-Pei and Xue, Wei and Luo, Wenhan and Guo, Yike},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {16008--16018},
  doi       = {10.1109/CVPR52734.2025.01492},
  url       = {https://mlanthology.org/cvpr/2025/li2025cvpr-pshuman/}
}