Qiao, Jianglin

1 publication

ICLR 2026. SSVPO: Effective Step-Level Credit Assignment for RL Training of Language Models. Yugu Li, Zehong Cao, Jianglin Qiao, Siyi Hu.