What Makes Math Problems Hard for Reinforcement Learning: A Case Study
Abstract
Using a long-standing conjecture from combinatorial group theory, we explore, from multiple perspectives, the challenges of finding rare instances that carry disproportionately high rewards. Based on lessons learned in the context of the Andrews--Curtis conjecture, we analyze how reinforcement learning agents handle problems of varying hardness. We also address many mathematical questions as part of our study. Notably, we demonstrate the length reducibility of all but two presentations in the Akbulut--Kirby series (1981), and resolve various potential counterexamples in the Miller--Schupp series (1991), including three infinite subfamilies.
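For orientation, the display below recalls the standard Andrews--Curtis moves and the usual forms of the two presentation families named above, as they commonly appear in the combinatorial group theory literature; the notation AK(n) and MS(n, w) is supplied here for illustration and is not quoted from the paper itself.

```latex
% Standard objects referenced in the abstract (notation assumed, not taken
% verbatim from the paper).

% AC moves on a balanced presentation <x_1, ..., x_k | r_1, ..., r_k>:
\[
r_i \mapsto r_i r_j \ (i \neq j), \qquad
r_i \mapsto r_i^{-1}, \qquad
r_i \mapsto w\, r_i\, w^{-1} \ \ (w \text{ any word in the generators}).
\]

% Akbulut--Kirby series:
\[
\mathrm{AK}(n) \;=\; \langle x, y \mid x^{n} = y^{n+1},\; xyx = yxy \rangle,
\qquad n \ge 2.
\]

% Miller--Schupp series (w a word in x, y with exponent sum zero in x):
\[
\mathrm{MS}(n, w) \;=\; \langle x, y \mid x^{-1} y^{n} x = y^{n+1},\; x = w \rangle,
\qquad n \ge 1.
\]
```

The Andrews--Curtis conjecture asserts that every balanced presentation of the trivial group can be carried to the trivial presentation by a finite sequence of such moves; both series above consist of balanced presentations of the trivial group that are commonly studied as potential counterexamples.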
Cite
Text
Shehper et al. "What Makes Math Problems Hard for Reinforcement Learning: A Case Study." Advances in Neural Information Processing Systems, 2025.

Markdown
[Shehper et al. "What Makes Math Problems Hard for Reinforcement Learning: A Case Study." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/shehper2025neurips-makes/)

BibTeX
@inproceedings{shehper2025neurips-makes,
  title     = {{What Makes Math Problems Hard for Reinforcement Learning: A Case Study}},
  author    = {Shehper, Ali and Medina-Mardones, Anibal M. and Fagan, Lucas and Lewandowski, Bartłomiej and Gruen, Angus and Qiu, Yang and Kucharski, Piotr and Wang, Zhenghan and Gukov, Sergei},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/shehper2025neurips-makes/}
}