ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory

Ouyang, Siru; Yan, Jun; Hsu, I-Hung; Chen, Yanfei; Jiang, Ke; Wang, Zifeng; Han, Rujun; Le, Long; Daruki, Samira; Tang, Xiangru; Tirumalashetty, Vishy; Lee, George; Rofouei, Mahsan; Lin, Hangfei; Han, Jiawei; Lee, Chen-Yu; Pfister, Tomas

ICLR 2026

Abstract

As large language model agents are increasingly adopted in persistent real-world roles, they naturally encounter continuous streams of tasks. A key limitation, however, is their failure to learn from the accumulated interaction history, forcing them to discard valuable insights and repeat past errors. We propose ReasoningBank, a novel memory framework that distills generalizable reasoning strategies from an agent's self-judged successful and failed experiences. At test time, an agent retrieves relevant memories from ReasoningBank to inform its interaction and then integrates new learnings back, enabling it to become more capable over time. Building on this powerful experience learner, we further introduce memory-aware test-time scaling (MaTTS), which accelerates and diversifies this learning process by scaling up the agent's interaction experience. By allocating more compute to each task, the agent generates abundant, diverse experiences that provide rich contrastive signals for synthesizing higher-quality memory. The better memory in turn guides more effective scaling, establishing a powerful synergy between memory and test-time scaling. Across web browsing and software engineering benchmarks, ReasoningBank consistently outperforms existing memory mechanisms that store raw trajectories or only successful task routines, improving both effectiveness and efficiency; MaTTS further amplifies these gains. These findings establish memory-driven experience scaling as a new scaling dimension, enabling agents to self-evolve, with emergent behaviors arising naturally. Our code can be found at https://github.com/google-research/reasoning-bank.
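
The retrieve, act, self-judge, and integrate loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: `MemoryItem`, `MemoryBank`, `token_overlap`, `run_task`, and the toy `act`/`judge`/`distill` callables are all hypothetical names introduced here, and word overlap stands in for whatever retriever the paper actually uses. MaTTS (sampling several trajectories per task) is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MemoryItem:
    """One distilled strategy (hypothetical schema, not taken from the paper)."""
    task: str
    strategy: str
    from_success: bool


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets; a stand-in for a real retriever."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


@dataclass
class MemoryBank:
    items: List[MemoryItem] = field(default_factory=list)

    def retrieve(self, task: str, k: int = 2) -> List[MemoryItem]:
        """Return up to k stored items whose source task is most similar."""
        ranked = sorted(self.items,
                        key=lambda m: token_overlap(task, m.task),
                        reverse=True)
        return ranked[:k]

    def add(self, item: MemoryItem) -> None:
        self.items.append(item)


def run_task(task: str, bank: MemoryBank,
             act: Callable, judge: Callable, distill: Callable) -> bool:
    """Retrieve memories, act, self-judge, then store a distilled strategy."""
    hints = bank.retrieve(task)
    trajectory = act(task, hints)
    success = judge(task, trajectory)          # self-judged, no ground truth
    strategy = distill(task, trajectory, success)
    bank.add(MemoryItem(task, strategy, success))  # failures are kept too
    return success


# Toy stand-ins for the LLM agent, judge, and distiller (assumptions).
def toy_act(task, hints):
    return f"{task} | hints used: {len(hints)}"

def toy_judge(task, trajectory):
    return "hints used: 0" not in trajectory

def toy_distill(task, trajectory, success):
    return ("reuse: " if success else "avoid: ") + trajectory


bank = MemoryBank()
first = run_task("find cheapest laptop", bank, toy_act, toy_judge, toy_distill)
second = run_task("find cheapest phone", bank, toy_act, toy_judge, toy_distill)
```

In this toy run the first task has no memory to draw on and is judged a failure, yet its distilled lesson is still stored; the second task retrieves that lesson and is judged a success. The point of the sketch is only that both outcomes feed the bank, which is the property the abstract contrasts with mechanisms that keep raw trajectories or successful routines alone.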

Cite

Text

Ouyang et al. "ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory." International Conference on Learning Representations, 2026.

Markdown

[Ouyang et al. "ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/ouyang2026iclr-reasoningbank/)

BibTeX

@inproceedings{ouyang2026iclr-reasoningbank,
  title     = {{ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory}},
  author    = {Ouyang, Siru and Yan, Jun and Hsu, I-Hung and Chen, Yanfei and Jiang, Ke and Wang, Zifeng and Han, Rujun and Le, Long and Daruki, Samira and Tang, Xiangru and Tirumalashetty, Vishy and Lee, George and Rofouei, Mahsan and Lin, Hangfei and Han, Jiawei and Lee, Chen-Yu and Pfister, Tomas},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/ouyang2026iclr-reasoningbank/}
}