Top-Down Beats Bottom-up in 3D Instance Segmentation

Abstract

Most 3D instance segmentation methods follow a bottom-up strategy, which typically includes resource-intensive post-processing. For point grouping, bottom-up methods rely on prior assumptions about the objects, expressed as hyperparameters that are domain-specific and must be carefully tuned. In contrast, we address 3D instance segmentation with TD3D: a pioneering cluster-free, fully convolutional, and entirely data-driven approach trained end-to-end. This is the first top-down method to outperform bottom-up approaches in the 3D domain. With its straightforward pipeline, it performs outstandingly well on the standard benchmarks: ScanNet v2, its extension ScanNet200, and S3DIS. Moreover, our method is much faster at inference than the current state-of-the-art grouping-based approaches: our flagship variant is 1.9x faster than the most accurate bottom-up method while also being more accurate, and our faster variant achieves state-of-the-art accuracy while running at 2.6x speed. Code is available at https://github.com/SamsungLabs/td3d.

Cite

Text

Kolodiazhnyi et al. "Top-Down Beats Bottom-up in 3D Instance Segmentation." Winter Conference on Applications of Computer Vision, 2024.

Markdown

[Kolodiazhnyi et al. "Top-Down Beats Bottom-up in 3D Instance Segmentation." Winter Conference on Applications of Computer Vision, 2024.](https://mlanthology.org/wacv/2024/kolodiazhnyi2024wacv-topdown/)

BibTeX

@inproceedings{kolodiazhnyi2024wacv-topdown,
  title     = {{Top-Down Beats Bottom-up in 3D Instance Segmentation}},
  author    = {Kolodiazhnyi, Maksim and Vorontsova, Anna and Konushin, Anton and Rukhovich, Danila},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2024},
  pages     = {3566--3574},
  url       = {https://mlanthology.org/wacv/2024/kolodiazhnyi2024wacv-topdown/}
}