NLIR: Natural Language Intermediate Representation for Mechanized Theorem Proving

Abstract

Formal theorem proving is challenging for humans as well as for machines. Thanks to recent advances in LLM capabilities, we believe natural language can serve as a universal interface for reasoning about formal proofs. In this paper, 1) we introduce Pétanque, a new lightweight environment to interact with the Coq theorem prover; 2) we present two interactive proof protocols leveraging natural language as an intermediate representation for designing proof steps; 3) we implement beam search over these interaction protocols, using natural language to rerank proof candidates; and 4) we use Pétanque to benchmark our search algorithms. Using our method with GPT-4o we can successfully synthesize proofs for 58% of the first 100/260 lemmas from the newly published Busy Beaver proofs.

Cite

Text

Teodorescu et al. "NLIR: Natural Language Intermediate Representation for Mechanized Theorem Proving." NeurIPS 2024 Workshops: MATH-AI, 2024.

Markdown

[Teodorescu et al. "NLIR: Natural Language Intermediate Representation for Mechanized Theorem Proving." NeurIPS 2024 Workshops: MATH-AI, 2024.](https://mlanthology.org/neuripsw/2024/teodorescu2024neuripsw-nlir/)

BibTeX

@inproceedings{teodorescu2024neuripsw-nlir,
  title     = {{NLIR: Natural Language Intermediate Representation for Mechanized Theorem Proving}},
  author    = {Teodorescu, Laetitia and Baudart, Guillaume and Arias, Emilio Jesús Gallego and Lelarge, Marc},
  booktitle = {NeurIPS 2024 Workshops: MATH-AI},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/teodorescu2024neuripsw-nlir/}
}