3D Surface Super-Resolution from Enhanced 2D Normal Images: A Multimodal-Driven Variational AutoEncoder Approach
Abstract
3D surface super-resolution is an important technique in virtual reality and an active research topic in computer vision. Because 3D object data are unstructured and irregular, it is usually difficult to obtain high-quality surface details and geometric textures with a low-cost hardware setup. In this paper, we establish a multimodal-driven variational autoencoder (mmVAE) framework that performs 3D surface enhancement based on 2D normal images. To fully leverage multimodal learning, we investigate a multimodal Gaussian mixture model (mmGMM) to align and fuse the latent feature representations from different modalities, and further propose a cross-scale encoder-decoder structure to reconstruct high-resolution normal images. Experimental results on several benchmark datasets demonstrate that our method delivers promising surface geometry structures and details compared with competing methods.
Cite
Text
Xie et al. "3D Surface Super-Resolution from Enhanced 2D Normal Images: A Multimodal-Driven Variational AutoEncoder Approach." International Joint Conference on Artificial Intelligence, 2023. doi:10.24963/IJCAI.2023/175
Markdown
[Xie et al. "3D Surface Super-Resolution from Enhanced 2D Normal Images: A Multimodal-Driven Variational AutoEncoder Approach." International Joint Conference on Artificial Intelligence, 2023.](https://mlanthology.org/ijcai/2023/xie2023ijcai-d/) doi:10.24963/IJCAI.2023/175
BibTeX
@inproceedings{xie2023ijcai-d,
title = {{3D Surface Super-Resolution from Enhanced 2D Normal Images: A Multimodal-Driven Variational AutoEncoder Approach}},
author = {Xie, Wuyuan and Huang, Tengcong and Wang, Miaohui},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2023},
pages = {1578-1586},
doi = {10.24963/IJCAI.2023/175},
url = {https://mlanthology.org/ijcai/2023/xie2023ijcai-d/}
}