StructLDM: Structured Latent Diffusion for 3D Human Generation

Abstract

Recent 3D human generative models have achieved remarkable progress by learning 3D-aware GANs from 2D images. However, existing 3D human generative methods model humans in a compact 1D latent space, ignoring the articulated structure and semantics of human body topology. In this paper, we explore a more expressive and higher-dimensional latent space for 3D human modeling and propose StructLDM, a diffusion-based unconditional 3D human generative model, which is learned from 2D images. StructLDM solves the challenges posed by the higher-dimensional latent space with three key designs: 1) A semantic structured latent space defined on the dense surface manifold of a statistical human body template. 2) A structured 3D-aware auto-decoder that factorizes the global latent space into several semantic body parts parameterized by a set of conditional structured local NeRFs anchored to the body template, which embeds the properties learned from the 2D training data and can be decoded to render view-consistent humans under different poses and clothing styles. 3) A structured latent diffusion model for generative human appearance sampling. Extensive experiments validate StructLDM's state-of-the-art generation performance and illustrate the expressiveness of the structured latent space over the widely adopted 1D latent space. Notably, StructLDM enables different levels of controllable 3D human generation and editing, including pose/view/shape control, and high-level tasks including compositional generation, part-aware clothing editing, 3D virtual try-on, etc. Project page: taohuumd.github.io/projects/StructLDM.
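To make the idea of a part-aware structured latent more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes the structured latent can be stored as a 2D grid of shape (H, W, C) laid out over the body template surface, with a hypothetical integer `part_mask` of shape (H, W) labelling which semantic body part each cell belongs to. The function names `compose_latents` and `forward_noise`, the grid layout, and the part labelling are all assumptions made for illustration. The first function mixes parts from two latents, as in compositional generation or part-aware editing; the second applies the standard DDPM forward noising step to such a latent.

```python
import numpy as np


def compose_latents(latent_a, latent_b, part_mask, part_ids):
    """Take the listed parts from latent_b and all other cells from latent_a.

    latent_a, latent_b: arrays of shape (H, W, C), hypothetical structured latents.
    part_mask: integer array of shape (H, W), hypothetical per-cell part labels.
    part_ids: iterable of part labels to copy from latent_b.
    """
    if latent_a.shape != latent_b.shape:
        raise ValueError("latents must have the same shape")
    if part_mask.shape != latent_a.shape[:2]:
        raise ValueError("part_mask must match the latent's spatial size")
    # Boolean (H, W) map of the cells that belong to the selected parts.
    take_b = np.isin(part_mask, list(part_ids))
    # Broadcast the map over the channel axis and select per cell.
    return np.where(take_b[..., None], latent_b, latent_a)


def forward_noise(latent, alpha_bar_t, rng):
    """Standard DDPM forward step: sqrt(a) * x0 + sqrt(1 - a) * eps.

    alpha_bar_t is the cumulative noise-schedule product at step t, in [0, 1].
    Returns the noised latent and the noise that was added.
    """
    noise = rng.standard_normal(latent.shape)
    noised = np.sqrt(alpha_bar_t) * latent + np.sqrt(1.0 - alpha_bar_t) * noise
    return noised, noise
```

Under these assumptions, swapping a clothing region between two subjects reduces to a masked selection in latent space, which is something a single 1D latent vector does not expose directly.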

Cite

Text

Hu et al. "StructLDM: Structured Latent Diffusion for 3D Human Generation." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72983-6_21

Markdown

[Hu et al. "StructLDM: Structured Latent Diffusion for 3D Human Generation." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/hu2024eccv-structldm/) doi:10.1007/978-3-031-72983-6_21

BibTeX

@inproceedings{hu2024eccv-structldm,
  title     = {{StructLDM: Structured Latent Diffusion for 3D Human Generation}},
  author    = {Hu, Tao and Hong, Fangzhou and Liu, Ziwei},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72983-6_21},
  url       = {https://mlanthology.org/eccv/2024/hu2024eccv-structldm/}
}