ShapeFormer: Transformer-Based Shape Completion via Sparse Representation

Abstract

We present ShapeFormer, a transformer-based network that produces a distribution of object completions, conditioned on incomplete, and possibly noisy, point clouds. The resultant distribution can then be sampled to generate likely completions, each of which exhibits plausible shape details, while being faithful to the input. To facilitate the use of transformers for 3D, we introduce a compact 3D representation, vector quantized deep implicit function (VQDIF), that utilizes spatial sparsity to represent a close approximation of a 3D shape by a short sequence of discrete variables. Experiments demonstrate that ShapeFormer outperforms prior art for shape completion from ambiguous partial inputs in terms of both completion quality and diversity. We also show that our approach effectively handles a variety of shape types, incomplete patterns, and real-world scans.
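The core idea behind VQDIF is that only the occupied cells of a 3D grid carry local shape features, and each feature is quantized to the index of its nearest entry in a learned codebook, yielding a short sequence of discrete tokens a transformer can model. The paper's actual encoder and codebook are learned end-to-end; the snippet below is only a minimal NumPy sketch of that quantize-and-serialize step, with all names (`quantize_sparse_features`, the toy grid size, feature width, and codebook size) hypothetical.

```python
import numpy as np

def quantize_sparse_features(coords, feats, codebook):
    """Map each local feature to the index of its nearest codebook entry.

    coords:   (N, 3) int array of occupied voxel coordinates (sparse sites)
    feats:    (N, D) float array of local shape features
    codebook: (K, D) float array of learned code vectors
    Returns the sparse sites and their code indices, ordered by a row-major
    scan of the coordinates so the result is a 1D token sequence.
    """
    # Squared Euclidean distance from every feature to every code vector.
    d2 = ((feats[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (N, K)
    idx = d2.argmin(axis=1)                                         # (N,)

    # Serialize: sort sparse sites in raster (row-major) order,
    # primary key x, then y, then z.
    order = np.lexsort((coords[:, 2], coords[:, 1], coords[:, 0]))
    return coords[order], idx[order]

# Toy usage: 5 occupied cells in a 16^3 grid, 8-dim features, 32 codes.
rng = np.random.default_rng(0)
coords = rng.integers(0, 16, size=(5, 3))
feats = rng.normal(size=(5, 8))
codebook = rng.normal(size=(32, 8))
sites, tokens = quantize_sparse_features(coords, feats, codebook)
print(tokens)  # short sequence of discrete variables, one per occupied cell
```

Because only occupied cells produce tokens, the sequence length scales with the shape's surface area rather than the full grid volume, which is what makes the representation compact enough for transformer modeling.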

Cite

Text

Yan et al. "ShapeFormer: Transformer-Based Shape Completion via Sparse Representation." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.00614

Markdown

[Yan et al. "ShapeFormer: Transformer-Based Shape Completion via Sparse Representation." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/yan2022cvpr-shapeformer/) doi:10.1109/CVPR52688.2022.00614

BibTeX

@inproceedings{yan2022cvpr-shapeformer,
  title     = {{ShapeFormer: Transformer-Based Shape Completion via Sparse Representation}},
  author    = {Yan, Xingguang and Lin, Liqiang and Mitra, Niloy J. and Lischinski, Dani and Cohen-Or, Daniel and Huang, Hui},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {6239--6249},
  doi       = {10.1109/CVPR52688.2022.00614},
  url       = {https://mlanthology.org/cvpr/2022/yan2022cvpr-shapeformer/}
}