PlaceIt3D: Language-Guided Object Placement in Real 3D Scenes
Abstract
We introduce the task of Language-Guided Object Placement in Real 3D Scenes. Given a 3D reconstructed point-cloud scene, a 3D asset, and a natural-language instruction, the goal is to place the asset so that the instruction is satisfied. The task demands tackling four intertwined challenges: (a) one-to-many ambiguity in valid placements; (b) precise geometric and physical reasoning; (c) joint understanding across the scene, the asset, and language; and (d) robustness to noisy point clouds with no privileged metadata at test time. The first three challenges mirror the complexities of synthetic scene generation, while the metadata-free, noisy-scan scenario is inherited from language-guided 3D visual grounding. We inaugurate this task by introducing a benchmark and evaluation protocol, releasing a dataset for training multi-modal large language models (MLLMs), and establishing a first nontrivial baseline. We believe this challenging setup and benchmark will provide a foundation for evaluating and advancing MLLMs in 3D understanding.
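To make the task's inputs and outputs concrete, below is a minimal sketch of the interface implied by the abstract. The names (`PlacementTask`, `Placement`, `place`) and the pose parameterization (translation plus yaw) are illustrative assumptions, not an API or representation specified by the paper.

```python
# Hypothetical sketch of the language-guided placement task I/O.
# Assumption: a placement is a translation plus a rotation about gravity;
# the paper may use a different parameterization.
from dataclasses import dataclass

import numpy as np


@dataclass
class PlacementTask:
    scene_points: np.ndarray  # (N, 6) reconstructed scene: xyz + rgb, no privileged metadata
    asset_points: np.ndarray  # (M, 3) point-sampled 3D asset to be placed
    instruction: str          # natural-language placement constraint


@dataclass
class Placement:
    translation: np.ndarray   # (3,) asset position in scene coordinates
    yaw: float                # rotation about the gravity axis, in radians


def place(task: PlacementTask) -> Placement:
    """Map (scene, asset, instruction) to one valid placement.

    The task is one-to-many: several placements may satisfy the same
    instruction, so an evaluation protocol should check whether a
    prediction is valid rather than compare it to a single ground truth.
    """
    raise NotImplementedError  # a baseline or MLLM-based model goes here
```

The one-to-many ambiguity noted in the abstract is why the output is a single proposed placement while evaluation must accept any placement that satisfies the instruction and the scene's geometric and physical constraints.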
Cite
Text
Abdelreheem et al. "PlaceIt3D: Language-Guided Object Placement in Real 3D Scenes." International Conference on Computer Vision, 2025.
Markdown
[Abdelreheem et al. "PlaceIt3D: Language-Guided Object Placement in Real 3D Scenes." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/abdelreheem2025iccv-placeit3d/)
BibTeX
@inproceedings{abdelreheem2025iccv-placeit3d,
  title = {{PlaceIt3D: Language-Guided Object Placement in Real 3D Scenes}},
  author = {Abdelreheem, Ahmed and Aleotti, Filippo and Watson, Jamie and Qureshi, Zawar and Eldesokey, Abdelrahman and Wonka, Peter and Brostow, Gabriel and Vicente, Sara and Garcia-Hernando, Guillermo},
  booktitle = {International Conference on Computer Vision},
  year = {2025},
  pages = {6645--6655},
  url = {https://mlanthology.org/iccv/2025/abdelreheem2025iccv-placeit3d/}
}