3D Highlighter: Localizing Regions on 3D Shapes via Text Descriptions

Abstract

We present 3D Highlighter, a technique for localizing semantic regions on a mesh using text as input. A key feature of our system is the ability to interpret "out-of-domain" localizations. Our system demonstrates the ability to reason about where to place non-obviously related concepts on an input 3D shape, such as adding clothing to a bare 3D animal model. Our method contextualizes the text description using a neural field and colors the corresponding region of the shape using a probability-weighted blend. Our neural optimization is guided by a pre-trained CLIP encoder, which bypasses the need for any 3D datasets or 3D annotations. Thus, 3D Highlighter is highly flexible, general, and capable of producing localizations on a myriad of input shapes.
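The probability-weighted blend mentioned in the abstract can be sketched as follows. This is an illustrative reconstruction, not code from the paper: the function name, the per-vertex probability array, and the specific colors are assumptions; in the actual system the probabilities come from a neural field optimized against a CLIP objective.

```python
import numpy as np

def blend_highlight(vertex_probs, base_color, highlight_color):
    """Color each vertex as a convex combination of the highlight
    color and the base mesh color, weighted by its predicted
    probability of belonging to the text-described region."""
    p = np.asarray(vertex_probs)[:, None]          # (V, 1)
    return p * highlight_color + (1.0 - p) * base_color

# Toy example: 3 vertices, gray base mesh, highlight color (values illustrative)
probs = np.array([0.0, 0.5, 1.0])                  # per-vertex region probabilities
base = np.array([0.5, 0.5, 0.5])                   # gray mesh color
highlight = np.array([1.0, 0.0, 0.0])              # highlight color
colors = blend_highlight(probs, base, highlight)   # (3, 3) vertex colors
```

Because the blend is differentiable in the probabilities, gradients from the CLIP-based rendering loss can flow back to the network that predicts them.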

Cite

Text

Decatur et al. "3D Highlighter: Localizing Regions on 3D Shapes via Text Descriptions." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.02005

Markdown

[Decatur et al. "3D Highlighter: Localizing Regions on 3D Shapes via Text Descriptions." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/decatur2023cvpr-3d/) doi:10.1109/CVPR52729.2023.02005

BibTeX

@inproceedings{decatur2023cvpr-3d,
  title     = {{3D Highlighter: Localizing Regions on 3D Shapes via Text Descriptions}},
  author    = {Decatur, Dale and Lang, Itai and Hanocka, Rana},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {20930--20939},
  doi       = {10.1109/CVPR52729.2023.02005},
  url       = {https://mlanthology.org/cvpr/2023/decatur2023cvpr-3d/}
}