Hu, Wenping

1 publication

ICLR 2026. Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models. Runze Liu, Jiakang Wang, Yuling Shi, Zhihui Xie, Chenxin An, Kaiyan Zhang, Jian Zhao, Xiaodong Gu, Lei Lin, Wenping Hu, Xiu Li, Fuzheng Zhang, Guorui Zhou, Kun Gai.