Liu, Jiacai

4 publications

ICLR 2026 Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy Chris Yuhao Liu, Liang Zeng, Yuzhen Xiao, Jujie He, Jiacai Liu, Chaojie Wang, Rui Yan, Wei Shen, Fuxiang Zhang, Jiacheng Xu, Yang Liu
ICLR 2025 $\phi$-Update: A Class of Policy Update Methods with Policy Convergence Guarantee Wenye Li, Jiacai Liu, Ke Wei
NeurIPS 2025 DAPO: Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage-Based Policy Optimization Jiacai Liu, Chaojie Wang, Chris Yuhao Liu, Liang Zeng, Rui Yan, Yiwen Sun, Yang Liu
JMLR 2025 On the Convergence of Projected Policy Gradient for Any Constant Step Sizes Jiacai Liu, Wenye Li, Dachao Lin, Ke Wei, Zhihua Zhang