Xu, Derong
3 publications
NeurIPS 2025
Process vs. Outcome Reward: Which Is Better for Agentic RAG Reinforcement Learning
Wenlin Zhang, Xiangyang Li, Kuicai Dong, Yichao Wang, Pengyue Jia, Xiaopeng Li, Yingyi Zhang, Derong Xu, Zhaocheng Du, Huifeng Guo, Ruiming Tang, Xiangyu Zhao