Symphonize 3D Semantic Scene Completion with Contextual Instance Queries

Abstract

3D Semantic Scene Completion (SSC) has emerged as a nascent and pivotal undertaking in autonomous driving, aiming to predict the voxel occupancy within volumetric scenes. However, prevailing methodologies primarily focus on voxel-wise feature aggregation while neglecting instance semantics and scene context. In this paper, we present a novel paradigm termed Symphonies (Scene-from-Insts) that integrates instance queries to orchestrate 2D-to-3D reconstruction and 3D scene modeling. Leveraging our proposed Serial Instance-Propagated Attentions, Symphonies dynamically encodes instance-centric semantics, facilitating intricate interactions between the image and volumetric domains. Simultaneously, Symphonies fosters holistic scene comprehension by capturing context through the efficient fusion of instance queries, alleviating geometric ambiguities such as occlusion and perspective errors through contextual scene reasoning. Experimental results demonstrate that Symphonies achieves state-of-the-art performance on the challenging SemanticKITTI and SSCBench-KITTI-360 benchmarks, yielding mIoU scores of 15.04 and 18.58, respectively. The code for our method is available at https://github.com/hustvl/Symphonies.
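For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a mean intersection-over-union (mIoU) score can be computed over flattened voxel labels. It is an illustration only, not the evaluation code of the paper or of either benchmark: the function name `mean_iou`, the `ignore_index` value of 255, and the choice to average over every class that appears in the prediction or the ground truth are all assumptions made here. Benchmark protocols may differ, for example in which classes are excluded from the average.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,  # assumed label for voxels excluded from scoring
) -> float:
    """Mean IoU over classes that occur in pred or target (hypothetical helper)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    union = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip voxels without a valid ground-truth label
        if p == t:
            # A correct voxel counts once toward both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # A wrong voxel enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1

    # Average only over classes that actually appear, to avoid dividing by zero.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0
```

Under these assumptions, `mean_iou([0, 1, 1, 2], [0, 1, 2, 2], num_classes=3)` averages per-class IoUs of 1.0, 0.5, and 0.5. Scores such as those reported in the abstract are conventionally this fraction expressed as a percentage.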

Cite

Text

Jiang et al. "Symphonize 3D Semantic Scene Completion with Contextual Instance Queries." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.01915

Markdown

[Jiang et al. "Symphonize 3D Semantic Scene Completion with Contextual Instance Queries." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/jiang2024cvpr-symphonize/) doi:10.1109/CVPR52733.2024.01915

BibTeX

@inproceedings{jiang2024cvpr-symphonize,
  title     = {{Symphonize 3D Semantic Scene Completion with Contextual Instance Queries}},
  author    = {Jiang, Haoyi and Cheng, Tianheng and Gao, Naiyu and Zhang, Haoyang and Lin, Tianwei and Liu, Wenyu and Wang, Xinggang},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {20258--20267},
  doi       = {10.1109/CVPR52733.2024.01915},
  url       = {https://mlanthology.org/cvpr/2024/jiang2024cvpr-symphonize/}
}