Plan$^\ast$RAG: Efficient Test-Time Planning for Retrieval Augmented Generation
Abstract
We introduce Plan$^\ast$RAG, a novel framework that enables structured multi-hop reasoning in retrieval-augmented generation (RAG) through test-time reasoning plan generation. While existing approaches such as ReAct maintain reasoning chains within the language model's context window, we observe that this often leads to plan fragmentation and execution failures. Our key insight is that by isolating the reasoning plan as a directed acyclic graph (DAG) outside the LM's working memory, we can enable *(1)* systematic *exploration* of reasoning paths, *(2)* *atomic* subqueries enabling precise retrievals and grounding, and *(3)* *efficiency* through parallel execution and bounded context window utilization. Moreover, Plan$^\ast$RAG's modular design allows it to be integrated with existing RAG methods, thus providing a practical solution to improve current RAG systems. On standard multi-hop reasoning benchmarks, Plan$^\ast$RAG consistently achieves improvements over recently proposed methods such as RQ-RAG and Self-RAG, while maintaining comparable computational costs.
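To make the idea of an external plan concrete, the following is a minimal sketch of holding a reasoning plan as a DAG and executing it level by level. It is not the authors' implementation: the `PLAN` dictionary, the `{q1}`-style placeholder convention, `toy_answer`, and `execute_plan` are all hypothetical names introduced here for illustration, and `toy_answer` merely stands in for whatever retrieval and generation step answers one atomic subquery.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical plan: node id -> (subquery template, parent node ids).
# Placeholders such as {q1} are filled with the parents' answers.
PLAN = {
    "q1": ("Who directed Inception?", []),
    "q2": ("Who composed the score of Inception?", []),
    "q3": ("Have {q1} and {q2} worked together on other films?", ["q1", "q2"]),
}

def toy_answer(subquery: str) -> str:
    """Stand-in (assumption) for retrieval plus generation on one subquery."""
    return f"<answer to: {subquery}>"

def execute_plan(plan, answer_fn, max_workers=4):
    """Run the DAG level by level; nodes in the same level run in parallel."""
    sorter = TopologicalSorter({n: parents for n, (_, parents) in plan.items()})
    sorter.prepare()  # raises graphlib.CycleError if the plan is not acyclic
    answers, levels = {}, []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = list(sorter.get_ready())  # all parents already answered
            # Each call sees only its own subquery plus its parents' answers,
            # so the context per call stays bounded.
            queries = [
                plan[n][0].format(**{p: answers[p] for p in plan[n][1]})
                for n in ready
            ]
            for node, result in zip(ready, pool.map(answer_fn, queries)):
                answers[node] = result
                sorter.done(node)
            levels.append(sorted(ready))
    return answers, levels

if __name__ == "__main__":
    answers, levels = execute_plan(PLAN, toy_answer)
    print(levels)
    print(answers["q3"])
```

In this sketch the two independent subqueries form one level and are dispatched together, while the dependent subquery waits until both parents are resolved. The plan itself lives in an ordinary data structure rather than in the model's context.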
Cite
Text
Verma et al. "Plan$^\ast$RAG: Efficient Test-Time Planning for Retrieval Augmented Generation." ICLR 2025 Workshops: LLM_Reason_and_Plan, 2025.
Markdown
[Verma et al. "Plan$^\ast$RAG: Efficient Test-Time Planning for Retrieval Augmented Generation." ICLR 2025 Workshops: LLM_Reason_and_Plan, 2025.](https://mlanthology.org/iclrw/2025/verma2025iclrw-plan/)
BibTeX
@inproceedings{verma2025iclrw-plan,
title = {{Plan$^\ast$RAG: Efficient Test-Time Planning for Retrieval Augmented Generation}},
author = {Verma, Prakhar and Midigeshi, Sukruta Prakash and Sinha, Gaurav and Solin, Arno and Natarajan, Nagarajan and Sharma, Amit},
booktitle = {ICLR 2025 Workshops: LLM_Reason_and_Plan},
year = {2025},
url = {https://mlanthology.org/iclrw/2025/verma2025iclrw-plan/}
}