Dai, Zhaoqian

1 publication

ICLR 2026. Selective Expert Guidance for Effective and Diverse Exploration in Reinforcement Learning of LLMs. Zishang Jiang, Jinyi Han, Tingyun Li, Xinyi Wang, Sihang Jiang, Zhaoqian Dai, Ma Shuguang, Fei Yu, Jiaqing Liang, Yanghua Xiao