3D Instances as 1d Kernels

Abstract

We introduce a 3D instance representation, termed instance kernels, where instances are represented by one-dimensional vectors that encode the semantic, positional, and shape information of 3D instances. We show that instance kernels enable easy mask inference by simply scanning kernels over the entire scenes, avoiding the heavy reliance on proposals or heuristic clustering algorithms in standard 3D instance segmentation pipelines. The idea of instance kernel is inspired by recent success of dynamic convolutions in 2D/3D instance segmentation. However, we find it non-trivial to represent 3D instances due to the disordered and unstructured nature of point cloud data, e.g., poor instance localization can significantly degrade instance representation. To remedy this, we construct a novel 3D instance encoding paradigm. First, potential instance centroids are localized as candidates. Then, a candidate merging scheme is devised to simultaneously aggregate duplicated candidates and collect context around the merged centroids to form the instance kernels. Once instance kernels are available, instance masks can be reconstructed via dynamic convolutions whose weights are conditioned on instance kernels. The whole pipeline is instantiated with a dynamic kernel network (DKNet). Results show that DKNet outperforms the state of the arts on both ScanNetV2 and S3DIS datasets with better instance localization. Code is available: https://github.com/W1zheng/DKNet.

Cite

Text

Wu et al. "3D Instances as 1d Kernels." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-19818-2_14

Markdown

[Wu et al. "3D Instances as 1d Kernels." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/wu2022eccv-3d/) doi:10.1007/978-3-031-19818-2_14

BibTeX

@inproceedings{wu2022eccv-3d,
  title     = {{3D Instances as 1d Kernels}},
  author    = {Wu, Yizheng and Shi, Min and Du, Shuaiyuan and Lu, Hao and Cao, Zhiguo and Zhong, Weicai},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2022},
  doi       = {10.1007/978-3-031-19818-2_14},
  url       = {https://mlanthology.org/eccv/2022/wu2022eccv-3d/}
}