Yuan, Yurun

2 publications

ICML 2025. Reinforce LLM Reasoning Through Multi-Agent Reflection. Yurun Yuan, Tengyang Xie.
NeurIPS 2025. Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning. Yurun Yuan, Fan Chen, Zeyu Jia, Alexander Rakhlin, Tengyang Xie.