Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process

Abstract

We develop a generalized 3D shape generation prior model tailored to multiple 3D tasks, including unconditional shape generation, point cloud completion, and cross-modality shape generation. On one hand, to precisely capture fine local shape details, a vector-quantized variational autoencoder (VQ-VAE) indexes local geometry from a compact codebook learned on a broad set of task training data. On the other hand, a discrete diffusion generator models the inherent structural dependencies among the tokens. In addition, a multi-frequency fusion module (MFM) suppresses high-frequency shape feature fluctuations, guided by multi-frequency contextual information. Together, these designs equip the proposed 3D shape prior model with high fidelity and diversity as well as cross-modality alignment capability, and extensive experiments demonstrate superior performance on various 3D shape generation tasks.
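The abstract describes two core components: part-level tokenization via a VQ-VAE codebook and a discrete diffusion generator over the resulting tokens. The sketch below is a minimal, hypothetical PyTorch illustration of both ideas; the class names, codebook size, embedding dimension, and the absorbing-state corruption schedule are assumptions made for illustration and are not taken from the paper's implementation.

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    """Standard VQ-VAE nearest-neighbor codebook lookup.

    Codebook size and embedding dimension are illustrative values,
    not the paper's settings.
    """
    def __init__(self, num_codes=512, dim=64, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta

    def forward(self, z):
        # z: (batch, parts, dim) continuous encoder features per local part
        flat = z.reshape(-1, z.size(-1))                   # (B*P, dim)
        dist = torch.cdist(flat, self.codebook.weight)     # (B*P, num_codes)
        indices = dist.argmin(dim=-1).view(z.shape[:-1])   # (B, P) part tokens
        z_q = self.codebook(indices)                       # quantized features
        # Commitment loss plus straight-through gradient (standard VQ-VAE)
        loss = self.beta * ((z_q.detach() - z) ** 2).mean() \
             + ((z_q - z.detach()) ** 2).mean()
        z_q = z + (z_q - z).detach()
        return z_q, indices, loss

def corrupt_tokens(indices, t, num_steps, mask_id):
    """Absorbing-state forward diffusion over discrete tokens: each token
    is independently replaced by a [MASK] id with probability t / num_steps.
    This D3PM-style kernel is one common choice; the paper's exact
    transition matrix may differ.
    """
    keep_prob = 1.0 - t / num_steps
    mask = torch.rand(indices.shape) >= keep_prob
    return torch.where(mask, torch.full_like(indices, mask_id), indices)

if __name__ == "__main__":
    vq = VectorQuantizer()
    z = torch.randn(2, 16, 64)          # 2 shapes, 16 local parts each
    z_q, tokens, vq_loss = vq(z)
    noisy = corrupt_tokens(tokens, t=50, num_steps=100, mask_id=512)
    print(tokens.shape, noisy.shape, vq_loss.item())
```

In the full model, the reverse process would be a learned network that predicts the original part tokens from the corrupted sequence; the snippet above only illustrates the tokenization and the forward corruption step.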

Cite

Text

Li et al. "Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.01610

Markdown

[Li et al. "Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/li2023cvpr-generalized/) doi:10.1109/CVPR52729.2023.01610

BibTeX

@inproceedings{li2023cvpr-generalized,
  title     = {{Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process}},
  author    = {Li, Yuhan and Dou, Yishun and Chen, Xuanhong and Ni, Bingbing and Sun, Yilin and Liu, Yutian and Wang, Fuzhen},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {16784--16794},
  doi       = {10.1109/CVPR52729.2023.01610},
  url       = {https://mlanthology.org/cvpr/2023/li2023cvpr-generalized/}
}