How to Enable LLM with 3D Capacity? a Survey of Spatial Reasoning in LLM
Abstract
3D spatial understanding is essential in real-world applications such as robotics, autonomous vehicles, virtual reality, and medical imaging. Recently, Large Language Models (LLMs), having demonstrated remarkable success across various domains, have been leveraged to enhance 3D understanding tasks, showing potential to surpass traditional computer vision methods. In this survey, we present a comprehensive review of methods integrating LLMs with 3D spatial understanding. We propose a taxonomy that categorizes existing methods into three branches: image-based methods deriving 3D understanding from 2D visual data, point cloud-based methods working directly with 3D representations, and hybrid modality-based methods combining multiple data streams. We systematically review representative methods along these categories, covering data representations, architectural modifications, and training strategies that bridge textual and 3D modalities. Finally, we discuss current limitations, including dataset scarcity and computational challenges, while highlighting promising research directions in spatial perception, multi-modal fusion, and real-world applications.
Cite
Text
Zha et al. "How to Enable LLM with 3D Capacity? a Survey of Spatial Reasoning in LLM." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/1200
Markdown
[Zha et al. "How to Enable LLM with 3D Capacity? a Survey of Spatial Reasoning in LLM." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/zha2025ijcai-enable/) doi:10.24963/IJCAI.2025/1200
BibTeX
@inproceedings{zha2025ijcai-enable,
title = {{How to Enable LLM with 3D Capacity? a Survey of Spatial Reasoning in LLM}},
author = {Zha, Jirong and Fan, Yuxuan and Yang, Xiao and Gao, Chen and Chen, Xinlei},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2025},
pages = {10817-10825},
doi = {10.24963/IJCAI.2025/1200},
url = {https://mlanthology.org/ijcai/2025/zha2025ijcai-enable/}
}