View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields

Abstract

Large-scale vision foundation models such as Segment Anything (SAM) demonstrate impressive performance in zero-shot image segmentation at multiple levels of granularity. However, these zero-shot predictions are rarely 3D-consistent. As the camera viewpoint changes in a scene, so do the segmentation predictions, as well as the characterizations of “coarse” or “fine” granularity. In this work, we address the challenging task of lifting multi-granular and view-inconsistent image segmentations into a hierarchical and 3D-consistent representation. We learn a novel feature field within a Neural Radiance Field (NeRF) representing a 3D scene, whose segmentation structure can be revealed at different scales by simply using different thresholds on feature distance. Our key idea is to learn an ultrametric feature space, which, unlike a Euclidean space, exhibits transitivity in distance-based grouping, naturally leading to a hierarchical clustering. Put together, our method takes view-inconsistent multi-granularity 2D segmentations as input and produces a hierarchy of 3D-consistent segmentations as output. We evaluate our method and several baselines on synthetic datasets with multi-view images and multi-granular segmentation, showcasing improved accuracy and viewpoint-consistency. We additionally provide qualitative examples of our model’s 3D hierarchical segmentations in real-world scenes. The code and dataset are available at: https://github.com/hardyho/ultrametric_feature_fields
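To illustrate why transitivity in distance-based grouping yields a hierarchy, the following minimal sketch uses a hand-made toy distance matrix. It is not the paper's implementation: the helper names `is_ultrametric` and `threshold_groups` and the matrices `D` and `E` are hypothetical and chosen only for illustration. An ultrametric satisfies d(i, k) <= max(d(i, j), d(j, k)), so "distance at most t" is an equivalence relation, and raising the threshold t can only merge groups.

```python
import itertools
import numpy as np

def is_ultrametric(d, tol=1e-9):
    """Check the strong triangle inequality d(i,k) <= max(d(i,j), d(j,k))."""
    n = d.shape[0]
    return all(
        d[i, k] <= max(d[i, j], d[j, k]) + tol
        for i, j, k in itertools.product(range(n), repeat=3)
    )

def threshold_groups(d, t):
    """Label points so that points within distance t share a label.

    Assumes d is an ultrametric: then 'distance <= t' is transitive, and
    joining the first earlier point within t gives a well-defined grouping.
    """
    n = d.shape[0]
    labels = [-1] * n
    next_label = 0
    for i in range(n):
        for j in range(i):
            if d[i, j] <= t:
                labels[i] = labels[j]
                break
        else:
            # No earlier point is within t, so start a new group.
            labels[i] = next_label
            next_label += 1
    return labels

# Toy ultrametric on 4 points: {0,1} merge at 1, {2,3} at 2, everything at 5.
D = np.array([
    [0, 1, 5, 5],
    [1, 0, 5, 5],
    [5, 5, 0, 2],
    [5, 5, 2, 0],
], dtype=float)

# Euclidean distances between the points 0, 1, 2 on a line, for contrast.
E = np.abs(np.subtract.outer(np.arange(3.0), np.arange(3.0)))

fine = threshold_groups(D, 1.0)    # [0, 0, 1, 2]
mid = threshold_groups(D, 2.0)     # [0, 0, 1, 1]
coarse = threshold_groups(D, 5.0)  # [0, 0, 0, 0]
print(is_ultrametric(D), is_ultrametric(E))  # True False
```

In the Euclidean example, a threshold of 1 links point 0 to 1 and point 1 to 2 but not 0 to 2, so thresholding alone does not define consistent groups. In the ultrametric example, each coarser grouping is a union of finer groups, which is the nesting that a hierarchy requires.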

Cite

Text

He et al. "View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73004-7_16

Markdown

[He et al. "View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/he2024eccv-viewconsistent/) doi:10.1007/978-3-031-73004-7_16

BibTeX

@inproceedings{he2024eccv-viewconsistent,
  title     = {{View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields}},
  author    = {He, Haodi and Stearns, Colton and Harley, Adam and Guibas, Leonidas},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-73004-7_16},
  url       = {https://mlanthology.org/eccv/2024/he2024eccv-viewconsistent/}
}