R-Cyclic Diffuser: Reductive and Cyclic Latent Diffusion for 3D Clothed Human Digitalization
Abstract
Recently, the authors of Zero-1-to-3 demonstrated that a latent diffusion model pretrained with Internet-scale data can not only address the single-view 3D object reconstruction task but also attain SOTA results in it. However, when applied to the task of single-view 3D clothed human reconstruction, Zero-1-to-3 (and related models) are unable to compete with the corresponding SOTA methods in this field, despite being trained on clothed human data. In this work, we aim to tailor Zero-1-to-3's approach to the single-view 3D clothed human reconstruction task in a much more principled and structured manner. To this end, we propose R-Cyclic Diffuser, a framework that adapts Zero-1-to-3's novel approach to clothed human data by fusing it with a pixel-aligned implicit model. R-Cyclic Diffuser offers three new contributions. The first and primary contribution is its cyclical conditioning mechanism for novel view synthesis, which directly addresses the view inconsistency problem faced by Zero-1-to-3 and related models. Second, we further enhance this mechanism with two key features: Lateral Inversion Constraint and Cyclic Noise Selection. Both features are designed to regularize and restrict the randomness of the outputs generated by a latent diffusion model. Third, we show how SMPL-X body priors can be incorporated into a latent diffusion model so that novel views of clothed human bodies can be generated much more accurately. Our experiments show that R-Cyclic Diffuser outperforms current SOTA methods in single-view 3D clothed human reconstruction, both qualitatively and quantitatively. Our code is publicly available at https://github.com/kcyt/r-cyclic-diffuser.
Cite
Text
Chan et al. "R-Cyclic Diffuser: Reductive and Cyclic Latent Diffusion for 3D Clothed Human Digitalization." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.00981
Markdown
[Chan et al. "R-Cyclic Diffuser: Reductive and Cyclic Latent Diffusion for 3D Clothed Human Digitalization." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/chan2024cvpr-rcyclic/) doi:10.1109/CVPR52733.2024.00981
BibTeX
@inproceedings{chan2024cvpr-rcyclic,
title = {{R-Cyclic Diffuser: Reductive and Cyclic Latent Diffusion for 3D Clothed Human Digitalization}},
author = {Chan, Kennard Yanting and Liu, Fayao and Lin, Guosheng and Foo, Chuan Sheng and Lin, Weisi},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2024},
pages = {10304--10313},
doi = {10.1109/CVPR52733.2024.00981},
url = {https://mlanthology.org/cvpr/2024/chan2024cvpr-rcyclic/}
}