Liu, Jiacai

3 publications

ICLR 2025 $\phi$-Update: A Class of Policy Update Methods with Policy Convergence Guarantee Wenye Li, Jiacai Liu, Ke Wei
NeurIPS 2025 DAPO: Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage-Based Policy Optimization Jiacai Liu, Chaojie Wang, Chris Yuhao Liu, Liang Zeng, Rui Yan, Yiwen Sun, Yang Liu
JMLR 2025 On the Convergence of Projected Policy Gradient for Any Constant Step Sizes Jiacai Liu, Wenye Li, Dachao Lin, Ke Wei, Zhihua Zhang