TMVNet : Using Transformers for Multi-View Voxel-Based 3D Reconstruction
Abstract
Previous research in multi-view 3D reconstruction has used different convolutional neural network (CNN) architectures to obtain a 3D voxel representation. Even though CNNs work well, they have limitations in exploiting long-range dependencies in sequence transduction tasks such as multi-view 3D reconstruction. In this paper, we propose TMVNet, a two-layer transformer encoder that can better use long-range dependency information. In contrast to the 2D CNN decoders used by previous approaches, our model uses a 3D CNN decoder to capture the relations between voxels in 3D space. In addition, our proposed 3D feature fusion network aggregates the 3D position features from the CNN and the long-range dependency features from the transformer. The proposed TMVNet is trained and tested on the ShapeNet dataset. Comparisons against ten state-of-the-art multi-view 3D reconstruction methods, along with the reported quantitative and qualitative results, showcase the superiority of our method.
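The pipeline the abstract describes (per-view features attended by a two-layer transformer encoder, fused, then decoded to a voxel grid with 3D convolutions) can be sketched in PyTorch as below. This is a minimal illustrative sketch, not the authors' implementation: the feature dimension, head count, fusion-by-averaging step, and voxel resolution are all assumptions for demonstration.

```python
import torch
import torch.nn as nn

class TMVNetSketch(nn.Module):
    """Hypothetical sketch of the TMVNet-style pipeline from the abstract."""

    def __init__(self, feat_dim=256, n_heads=8):
        super().__init__()
        # Two-layer transformer encoder over the sequence of view features,
        # capturing long-range dependencies across views.
        layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Map the fused feature vector to a small 3D grid, then upsample
        # with 3D transposed convolutions to a 32^3 occupancy volume.
        self.fc = nn.Linear(feat_dim, 64 * 4 * 4 * 4)
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(64, 32, 4, stride=2, padding=1),  # 4 -> 8
            nn.ReLU(),
            nn.ConvTranspose3d(32, 16, 4, stride=2, padding=1),  # 8 -> 16
            nn.ReLU(),
            nn.ConvTranspose3d(16, 1, 4, stride=2, padding=1),   # 16 -> 32
            nn.Sigmoid(),  # per-voxel occupancy probability in [0, 1]
        )

    def forward(self, view_feats):
        # view_feats: (batch, n_views, feat_dim) per-view image features
        encoded = self.encoder(view_feats)      # attend across views
        fused = encoded.mean(dim=1)             # simple view aggregation
        grid = self.fc(fused).view(-1, 64, 4, 4, 4)
        return self.decoder(grid)               # (batch, 1, 32, 32, 32)

model = TMVNetSketch()
out = model(torch.randn(2, 5, 256))  # 2 samples, 5 views each
print(out.shape)  # torch.Size([2, 1, 32, 32, 32])
```

The averaging fusion stands in for the paper's 3D feature fusion network; the point is only to show where the transformer (cross-view, long-range) and the 3D CNN (voxel-space) components sit relative to each other.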
Cite
Text

Peng et al. "TMVNet : Using Transformers for Multi-View Voxel-Based 3D Reconstruction." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022. doi:10.1109/CVPRW56347.2022.00036

Markdown

[Peng et al. "TMVNet : Using Transformers for Multi-View Voxel-Based 3D Reconstruction." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022.](https://mlanthology.org/cvprw/2022/peng2022cvprw-tmvnet/) doi:10.1109/CVPRW56347.2022.00036

BibTeX
@inproceedings{peng2022cvprw-tmvnet,
title = {{TMVNet : Using Transformers for Multi-View Voxel-Based 3D Reconstruction}},
author = {Peng, Kebin and Islam, Rifatul and Quarles, John and Desai, Kevin},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2022},
pages = {221-229},
doi = {10.1109/CVPRW56347.2022.00036},
url = {https://mlanthology.org/cvprw/2022/peng2022cvprw-tmvnet/}
}