Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction

Lin, Yong; Tang, Shange; Lyu, Bohan; Yang, Ziran; Chung, Jui-Hui; Zhao, Haoyu; Jiang, Lai; Geng, Yihan; Ge, Jiawei; Sun, Jingruo; Wu, Jiayun; Gesi, Jiri; Lu, Ximing; Acuna, David; Yang, Kaiyu; Lin, Hongzhou; Choi, Yejin; Chen, Danqi; Arora, Sanjeev; Jin, Chi

Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction

ICLR 2026

/iclr/2026/lin2026iclr-goedelproverv2/

Abstract

Automated theorem proving (ATP) --- the task of generating a proof that passes automated proof verification given a math question in formal language --- is a critical challenge at the intersection of mathematics and Artificial Intelligence (AI). We introduce Goedel-Prover-V2, a family of two language models that establish a new state-of-the-art (SOTA) in open-source ATP, using the Lean proof assistant. In addition to standard expert iteration and reinforcement learning, our approach incorporates three key innovations: (1) During training when improvement plateaus on human questions, the prover does scaffolded data synthesis to generate synthetic questions of increasing difficulty for its own training; (2) The prover is trained to self-correct using Lean compiler feedback; (3) Improved test-time exploration through checkpoint averaging to balance accuracy and diversity. Our small model, Goedel-Prover-V2-8B, reaches 84.6\% pass@32 on MiniF2F and outperforms DeepSeek-Prover-V2-671B despite being $80\times$ smaller. Our flagship model, Goedel-Prover-V2-32B, achieves 88.1\% on MiniF2F at pass@32 in standard mode and 90.4\% in self-correction mode, outperforming prior SOTA by a large margin. Additionally, our flagship model solves 86 problems on PutnamBench at pass@184, securing first place among open-source models and surpassing DeepSeek-Prover-V2-671B's record of 47 problems by pass@1024 with about $20\times$ smaller model size and significantly lower compute budget. Our models, code, and data are released at \url{https://github.com/Goedel-LM/Goedel-Prover-V2}.

PDF ICLR OpenReview Semantic Scholar

Cite

Text

Lin et al. "Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction." International Conference on Learning Representations, 2026.

Markdown

[Lin et al. "Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/lin2026iclr-goedelproverv2/)

BibTeX

@inproceedings{lin2026iclr-goedelproverv2,
  title     = {{Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction}},
  author    = {Lin, Yong and Tang, Shange and Lyu, Bohan and Yang, Ziran and Chung, Jui-Hui and Zhao, Haoyu and Jiang, Lai and Geng, Yihan and Ge, Jiawei and Sun, Jingruo and Wu, Jiayun and Gesi, Jiri and Lu, Ximing and Acuna, David and Yang, Kaiyu and Lin, Hongzhou and Choi, Yejin and Chen, Danqi and Arora, Sanjeev and Jin, Chi},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/lin2026iclr-goedelproverv2/}
}