SeePhys: Does Seeing Help Thinking? – Benchmarking Vision-Based Physics Reasoning

Abstract

We present SeePhys, a large-scale multimodal benchmark for LLM reasoning grounded in physics questions ranging from middle-school level to PhD qualifying exams. The benchmark covers 7 fundamental domains spanning the physics discipline and incorporates 21 categories of highly heterogeneous diagrams. In contrast to prior work, where visual elements mainly serve auxiliary purposes, our benchmark features a substantial proportion (75%) of vision-essential problems, which cannot be solved correctly without extracting visual information. Through extensive evaluation, we observe that even the most advanced visual reasoning models (e.g., Gemini-2.5-pro and o4-mini) achieve below 60% accuracy on our benchmark. These results reveal fundamental challenges in the visual understanding capabilities of current large language models, particularly in: (i) establishing rigorous coupling between diagram interpretation and physics reasoning, and (ii) overcoming their persistent reliance on textual cues as cognitive shortcuts.

Project Page: github.com/SeePhys/seephys-project

Hugging Face: huggingface.co/datasets/SeePhys/SeePhys

Cite

Text

Xiang et al. "SeePhys: Does Seeing Help Thinking? – Benchmarking Vision-Based Physics Reasoning." Advances in Neural Information Processing Systems, 2025.

Markdown

[Xiang et al. "SeePhys: Does Seeing Help Thinking? – Benchmarking Vision-Based Physics Reasoning." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/xiang2025neurips-seephys/)

BibTeX

@inproceedings{xiang2025neurips-seephys,
  title     = {{SeePhys: Does Seeing Help Thinking? – Benchmarking Vision-Based Physics Reasoning}},
  author    = {Xiang, Kun and Li, Heng and Zhang, Terry Jingchen and Huang, Yinya and Liu, Zirong and Qu, Peixin and He, Jixi and Chen, Jiaqi and Yuan, Yu-Jie and Han, Jianhua and Xu, Hang and Li, Hanhui and Sachan, Mrinmaya and Liang, Xiaodan},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/xiang2025neurips-seephys/}
}