Pi-GPS: Enhancing Geometry Problem Solving by Unleashing the Power of Diagrammatic Information

Abstract

Geometry problem solving has garnered increasing attention due to its potential applications in the field of intelligent education. Inspired by the observation that text often introduces ambiguities that diagrams can clarify, this paper presents Pi-GPS, a novel framework that unleashes the power of diagrammatic information to resolve textual ambiguities, an aspect largely overlooked in prior research. Specifically, we design a micro module comprising a rectifier and a verifier: the rectifier employs multimodal large language models (MLLMs) to disambiguate text based on the diagrammatic context, while the verifier ensures that the rectified output adheres to geometric rules, mitigating model hallucinations. Additionally, we explore the impact of large language models (LLMs) in the theorem predictor, which operates on the disambiguated formal language. Empirical results demonstrate that Pi-GPS surpasses state-of-the-art models, achieving a nearly 10% improvement on Geometry3K over prior neural-symbolic approaches. We hope this work highlights the significance of resolving textual ambiguity in multimodal mathematical reasoning, a crucial factor limiting performance.

Cite

Text

Zhao et al. "Pi-GPS: Enhancing Geometry Problem Solving by Unleashing the Power of Diagrammatic Information." International Conference on Computer Vision, 2025.

Markdown

[Zhao et al. "Pi-GPS: Enhancing Geometry Problem Solving by Unleashing the Power of Diagrammatic Information." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/zhao2025iccv-pigps/)

BibTeX

@inproceedings{zhao2025iccv-pigps,
  title     = {{Pi-GPS: Enhancing Geometry Problem Solving by Unleashing the Power of Diagrammatic Information}},
  author    = {Zhao, Junbo and Zhang, Ting and Sun, Jiayu and Tian, Mi and Huang, Hua},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {1526--1536},
  url       = {https://mlanthology.org/iccv/2025/zhao2025iccv-pigps/}
}