PartSAM: A Scalable Promptable Part Segmentation Model Trained on Native 3D Data

Abstract

Segmenting 3D objects into parts is a long-standing challenge in computer vision. To overcome taxonomy constraints and generalize to unseen 3D objects, recent works turn to open-world part segmentation. These approaches typically transfer supervision from 2D foundation models, such as SAM, by lifting multi-view masks into 3D. However, this indirect paradigm fails to capture intrinsic geometry, leading to surface-only understanding, uncontrolled decomposition, and limited generalization. We present PartSAM, the first promptable part segmentation model trained natively on large-scale 3D data. Following the design philosophy of SAM, PartSAM employs an encoder–decoder architecture in which a triplane-based dual-branch encoder produces spatially structured tokens for scalable part-aware representation learning. To enable large-scale supervision, we further introduce a model-in-the-loop annotation pipeline that curates over five million 3D shape–part pairs from online assets, providing diverse and fine-grained labels. This combination of scalable architecture and diverse 3D data yields emergent open-world capabilities: with a single prompt, PartSAM achieves highly accurate part identification, and in a “Segment-Every-Part” mode, it automatically decomposes shapes into both surface and internal structures. Extensive experiments show that PartSAM outperforms state-of-the-art methods by large margins across multiple benchmarks, marking a decisive step toward foundation models for 3D part understanding.
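The abstract does not describe how the triplane-based encoder is built. As a rough illustration of the general triplane idea only, and not PartSAM's actual encoder, the sketch below maps a 3D point to one cell on each of three axis-aligned feature planes and sums the features found there. The function names (`triplane_cells`, `triplane_feature`), the nearest-cell lookup, and the summation are all assumptions made for this example; practical triplane models typically use bilinear interpolation and learned plane features.

```python
def triplane_cells(point, resolution):
    """Map a 3D point in [-1, 1]^3 to one grid cell on each of the XY, XZ and YZ planes.

    Hypothetical helper for illustration; not taken from the PartSAM paper.
    """
    def to_index(coord):
        # Normalize from [-1, 1] to [0, 1], scale to a cell index, and clamp to the grid.
        u = (coord + 1.0) / 2.0
        return max(0, min(int(u * resolution), resolution - 1))

    x, y, z = point
    ix, iy, iz = to_index(x), to_index(y), to_index(z)
    return {"xy": (ix, iy), "xz": (ix, iz), "yz": (iy, iz)}


def triplane_feature(point, planes, resolution):
    """Sum the nearest-cell feature vectors from the three planes.

    `planes[name][i][j]` is assumed to be a list of floats (one feature vector per cell).
    """
    cells = triplane_cells(point, resolution)
    channels = len(planes["xy"][0][0])
    feature = [0.0] * channels
    for name, (i, j) in cells.items():
        for c in range(channels):
            feature[c] += planes[name][i][j][c]
    return feature


if __name__ == "__main__":
    res = 2
    # Toy planes with a single channel whose value encodes the cell position.
    planes = {
        name: [[[float(i * res + j)] for j in range(res)] for i in range(res)]
        for name in ("xy", "xz", "yz")
    }
    print(triplane_cells((0.5, -0.5, 0.5), res))
    print(triplane_feature((0.5, -0.5, 0.5), planes, res))
```

Because each point reads from fixed 2D grids, the resulting tokens are spatially structured in the sense that nearby points share or neighbor the same cells, which is the property the abstract attributes to the triplane representation.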

Cite

Text

Zhu et al. "PartSAM: A Scalable Promptable Part Segmentation Model Trained on Native 3D Data." International Conference on Learning Representations, 2026.

Markdown

[Zhu et al. "PartSAM: A Scalable Promptable Part Segmentation Model Trained on Native 3D Data." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/zhu2026iclr-partsam/)

BibTeX

@inproceedings{zhu2026iclr-partsam,
  title     = {{PartSAM: A Scalable Promptable Part Segmentation Model Trained on Native 3D Data}},
  author    = {Zhu, Zhe and Wan, Le and Xu, Rui and Zhang, Yiheng and Chen, Honghua and Dou, Zhiyang and Lin, Cheng and Liu, Yuan and Wei, Mingqiang},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/zhu2026iclr-partsam/}
}