Search and Refine During Think: Facilitating Knowledge Refinement for Improved Retrieval-Augmented Reasoning
Abstract
Large language models have demonstrated impressive reasoning capabilities but are inherently limited by their knowledge reservoir. Retrieval-augmented reasoning mitigates this limitation by allowing LLMs to query external resources, but existing methods often retrieve irrelevant or noisy information, hindering accurate reasoning. In this paper, we propose **AutoRefine**, a reinforcement learning post-training framework that adopts a new "search-and-refine-during-think" paradigm. AutoRefine introduces explicit knowledge refinement steps between successive search calls, enabling the model to iteratively filter, distill, and organize evidence before generating an answer. Furthermore, we incorporate tailored retrieval-specific rewards alongside answer correctness rewards using group relative policy optimization. Experiments on single-hop and multi-hop QA benchmarks demonstrate that AutoRefine significantly outperforms existing approaches, particularly in complex, multi-hop reasoning scenarios. Detailed analysis shows that AutoRefine issues frequent, higher-quality searches and synthesizes evidence effectively.
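The group-relative step mentioned in the abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's implementation. It assumes the standard group relative policy optimization normalization (each rollout's reward minus the group mean, divided by the group standard deviation). The `combined_reward` function, its additive form, and the `retrieval_weight` value are hypothetical stand-ins for the "retrieval-specific rewards alongside answer correctness rewards" described above.

```python
from statistics import mean, pstdev


def combined_reward(answer_correct, evidence_retained, retrieval_weight=0.1):
    """Hypothetical reward: answer correctness plus a small retrieval bonus.

    The additive form and the default weight are assumptions made for
    illustration only; the abstract does not specify them.
    """
    return float(answer_correct) + retrieval_weight * float(evidence_retained)


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each rollout's reward against its own sampling group.

    Uses the population standard deviation (an assumption); eps avoids
    division by zero when every rollout in the group has the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four rollouts sampled for the same question:
# (answer correct?, refined evidence contains the answer?)
rollouts = [(True, True), (False, True), (False, False), (True, False)]
rewards = [combined_reward(a, e) for a, e in rollouts]
advantages = group_relative_advantages(rewards)
```

Under these assumptions, a rollout that answers correctly and retains useful evidence receives the largest advantage in its group, while a rollout that does neither receives the smallest.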
Cite
Text
Shi et al. "Search and Refine During Think: Facilitating Knowledge Refinement for Improved Retrieval-Augmented Reasoning." Advances in Neural Information Processing Systems, 2025.
Markdown
[Shi et al. "Search and Refine During Think: Facilitating Knowledge Refinement for Improved Retrieval-Augmented Reasoning." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/shi2025neurips-search/)
BibTeX
@inproceedings{shi2025neurips-search,
title = {{Search and Refine During Think: Facilitating Knowledge Refinement for Improved Retrieval-Augmented Reasoning}},
author = {Shi, Yaorui and Li, Sihang and Wu, Chang and Liu, Zhiyuan and Fang, Junfeng and Cai, Hengxing and Zhang, An and Wang, Xiang},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/shi2025neurips-search/}
}