Hu, Wenping
1 publication
ICLR 2026
Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models
Runze Liu, Jiakang Wang, Yuling Shi, Zhihui Xie, Chenxin An, Kaiyan Zhang, Jian Zhao, Xiaodong Gu, Lei Lin, Wenping Hu, Xiu Li, Fuzheng Zhang, Guorui Zhou, Kun Gai