Improving Retrieval-Augmented Generation Through Multi-Agent Reinforcement Learning

Chen, Yiqun; Yan, Lingyong; Sun, Weiwei; Ma, Xinyu; Zhang, Yi; Wang, Shuaiqiang; Yin, Dawei; Yang, Yiming; Mao, Jiaxin

NeurIPS 2025

Abstract

Retrieval-augmented generation (RAG) is widely used to incorporate external knowledge into large language models, thereby enhancing factuality and reducing hallucinations in question-answering (QA) tasks. A standard RAG pipeline consists of several components, such as query rewriting, document retrieval, document filtering, and answer generation. However, these components are typically optimized separately through supervised fine-tuning, which can lead to misalignments between the objectives of individual components and the overarching aim of generating accurate answers. Although recent efforts have explored using reinforcement learning (RL) to optimize specific RAG components, these approaches often focus on simple pipelines with only two components or do not adequately address the complex interdependencies and collaborative interactions among the modules. To overcome these limitations, we propose treating the complex RAG pipeline with multiple components as a multi-agent cooperative task, in which each component can be regarded as an RL agent. Specifically, we present MMOA-RAG (code available at https://github.com/chenyiqun/MMOA-RAG), the Multi-Module joint Optimization Algorithm for RAG, which employs multi-agent reinforcement learning to harmonize all agents' goals toward a unified reward, such as the F1 score of the final answer. Experiments conducted on various QA benchmarks demonstrate that MMOA-RAG effectively boosts the overall performance of the pipeline and outperforms existing baselines. Furthermore, comprehensive ablation studies validate the contributions of individual components and demonstrate that MMOA-RAG can be adapted to different RAG pipelines and benchmarks.
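The unified reward mentioned in the abstract can be illustrated with a small sketch. The abstract only says the reward is "such as the F1 score of the final answer"; the token-level definition of F1, the whitespace tokenization, the function names `token_f1` and `shared_rewards`, and the agent names below are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from typing import Dict


def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted and a reference answer.

    Assumption: lowercase + whitespace tokenization; the paper may
    normalize answers differently.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical agent names, loosely based on the pipeline components
# listed in the abstract.
AGENTS = ("query_rewriter", "document_filter", "answer_generator")


def shared_rewards(prediction: str, reference: str) -> Dict[str, float]:
    """Give every agent the same scalar reward: the final answer's F1."""
    reward = token_f1(prediction, reference)
    return {agent: reward for agent in AGENTS}
```

For example, `shared_rewards("the city of Paris", "Paris")` assigns the same value to all three agents, so in this sketch no agent can raise its own reward without improving the final answer.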

Cite

Text

Chen et al. "Improving Retrieval-Augmented Generation Through Multi-Agent Reinforcement Learning." Advances in Neural Information Processing Systems, 2025.

Markdown

[Chen et al. "Improving Retrieval-Augmented Generation Through Multi-Agent Reinforcement Learning." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/chen2025neurips-improving-a/)

BibTeX

@inproceedings{chen2025neurips-improving-a,
  title     = {{Improving Retrieval-Augmented Generation Through Multi-Agent Reinforcement Learning}},
  author    = {Chen, Yiqun and Yan, Lingyong and Sun, Weiwei and Ma, Xinyu and Zhang, Yi and Wang, Shuaiqiang and Yin, Dawei and Yang, Yiming and Mao, Jiaxin},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/chen2025neurips-improving-a/}
}