MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-Based Memory Agent
Abstract
Despite improvements from length extrapolation, efficient attention, and memory modules, handling infinitely long documents without performance degradation during extrapolation remains the ultimate challenge in long-text processing. To solve this problem, we introduce a novel agent workflow, MemAgent, which processes text in segments and updates memory through an overwrite strategy, addressing the challenge of long-context tasks through enhanced memory management. We further extend the DAPO algorithm to directly optimize memory ability in an end-to-end fashion, facilitating training via independent-context multi-conversation generation. Experimental results demonstrate that MemAgent has superb long-context capabilities: it extrapolates from an 8K context to a 3.5M QA task with a performance loss of less than 10% and achieves over 95% on the 512K NIAH test.
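The segment-wise workflow with an overwritten memory can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`split_into_chunks`, `run_memory_agent`, `toy_update`, `toy_answer`), the token-list representation, and the truncation to `memory_limit` are all assumptions made for illustration, and the toy update rule stands in for the LLM call that would rewrite the memory in the real system.

```python
from typing import Callable, List

Memory = List[str]


def split_into_chunks(tokens: List[str], chunk_size: int) -> List[List[str]]:
    """Split a long token sequence into consecutive fixed-size segments."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def run_memory_agent(
    tokens: List[str],
    question: str,
    update_memory: Callable[[str, Memory, List[str]], Memory],
    answer: Callable[[str, Memory], str],
    chunk_size: int,
    memory_limit: int,
) -> str:
    """Read the document one segment at a time with a bounded memory.

    Each step sees only (question, current memory, one segment), so the
    per-step input stays bounded no matter how long the document is.
    """
    memory: Memory = []
    for chunk in split_into_chunks(tokens, chunk_size):
        # Overwrite strategy: the old memory is replaced by the new one,
        # which is truncated to a fixed budget (an assumption of this sketch).
        memory = update_memory(question, memory, chunk)[:memory_limit]
    return answer(question, memory)


def toy_update(question: str, memory: Memory, chunk: List[str]) -> Memory:
    """Hypothetical stand-in for the model: keep tokens matching the query."""
    return memory + [tok for tok in chunk if tok.startswith(question)]


def toy_answer(question: str, memory: Memory) -> str:
    """Hypothetical stand-in for the final answer step."""
    return " ".join(memory)


if __name__ == "__main__":
    doc = ["filler"] * 1000 + ["code=42"] + ["filler"] * 1000
    print(run_memory_agent(doc, "code=", toy_update, toy_answer, 128, 8))
```

In this sketch the cost grows linearly with the number of segments, since every step handles at most `chunk_size + memory_limit` tokens. How the memory is actually rewritten is what the abstract says is learned through reinforcement learning, and that part is not modeled here.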
Cite
Text
Yu et al. "MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-Based Memory Agent." International Conference on Learning Representations, 2026.
Markdown
[Yu et al. "MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-Based Memory Agent." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/yu2026iclr-memagent/)
BibTeX
@inproceedings{yu2026iclr-memagent,
title = {{MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-Based Memory Agent}},
author = {Yu, Hongli and Chen, Tinghong and Feng, Jiangtao and Chen, Jiangjie and Dai, Weinan and Yu, Qiying and Zhang, Ya-Qin and Ma, Wei-Ying and Liu, Jingjing and Wang, Mingxuan and Zhou, Hao},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/yu2026iclr-memagent/}
}