Neural Interactive Proofs

Abstract

We consider the problem of how a trusted but computationally bounded agent (a 'verifier') can learn to interact with one or more powerful but untrusted agents ('provers') in order to solve a given task. More specifically, we study the case in which agents are represented using neural networks, and refer to solutions of this problem as neural interactive proofs. First, we introduce a unifying framework based on prover-verifier games (Anil et al., 2021), which generalises previously proposed interaction protocols. We then describe several new protocols for generating neural interactive proofs, and provide a theoretical comparison of both new and existing approaches. Finally, we support this theory with experiments in two domains: a toy graph isomorphism problem that illustrates the key ideas, and a code validation task using large language models. In so doing, we aim to create a foundation for future work on neural interactive proofs and their application in building safer AI systems.
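To fix intuitions for the verifier-prover setup the abstract describes, the following is a minimal sketch of the classic interactive proof for graph non-isomorphism, the textbook protocol underlying the toy graph isomorphism domain mentioned above. It is illustrative only: the function names and the brute-force "unbounded" prover are assumptions for exposition, not the paper's neural protocols or code.

```python
import itertools
import random

# Illustrative sketch (assumed, not from the paper): the classic interactive
# proof for graph NON-isomorphism. A computationally bounded verifier
# challenges an untrusted prover with unbounded compute.

def permute(graph, perm):
    """Relabel a graph (a frozenset of frozenset edges) under a vertex map."""
    return frozenset(frozenset(perm[v] for v in edge) for edge in graph)

def isomorphic(g, h, n):
    """Brute-force isomorphism test; stands in for the prover's power."""
    return any(permute(g, dict(enumerate(p))) == h
               for p in itertools.permutations(range(n)))

def prover_answer(g0, g1, challenge, n):
    """The prover sees both public graphs and guesses which one was shuffled."""
    return 0 if isomorphic(challenge, g0, n) else 1

def verify(g0, g1, n, rounds=20):
    """Accept the claim 'g0 and g1 are non-isomorphic' iff the prover
    recovers the verifier's secret coin flip in every round."""
    for _ in range(rounds):
        b = random.randrange(2)                       # secret coin flip
        perm = dict(enumerate(random.sample(range(n), n)))
        challenge = permute((g0, g1)[b], perm)        # shuffled copy of g_b
        if prover_answer(g0, g1, challenge, n) != b:  # prover must recover b
            return False   # if g0 and g1 are isomorphic, the prover can do
                           # no better than chance, so it fails w.p. 1/2/round
    return True

# A 4-vertex path vs. a 4-vertex star: non-isomorphic (degree sequences differ).
path = frozenset(map(frozenset, [(0, 1), (1, 2), (2, 3)]))
star = frozenset(map(frozenset, [(0, 1), (0, 2), (0, 3)]))
print(verify(path, star, n=4))  # True: the prover always identifies b
```

The asymmetry here mirrors the paper's setting: the verifier's work is cheap (random permutations and equality checks), while soundness rests on the fact that no prover, however powerful, can distinguish shuffled copies of isomorphic graphs better than chance.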

Cite

Text

Hammond and Adam-Day. "Neural Interactive Proofs." International Conference on Learning Representations, 2025.

Markdown

[Hammond and Adam-Day. "Neural Interactive Proofs." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/hammond2025iclr-neural/)

BibTeX

@inproceedings{hammond2025iclr-neural,
  title     = {{Neural Interactive Proofs}},
  author    = {Hammond, Lewis and Adam-Day, Sam},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/hammond2025iclr-neural/}
}