Mei, Zhiyu

5 publications

NeurIPS 2025 AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning Wei Fu, Jiaxuan Gao, Xujie Shen, Chen Zhu, Zhiyu Mei, Chuyi He, Shusheng Xu, Guo Wei, Jun Mei, Wang Jiashu, Tongkai Yang, Binhang Yuan, Yi Wu
NeurIPS 2025 How Far Are We from Optimal Reasoning Efficiency? Jiaxuan Gao, Shu Yan, Qixin Tan, Lu Yang, Shusheng Xu, Wei Fu, Zhiyu Mei, Kaifeng Lyu, Yi Wu
ICML 2024 Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study Shusheng Xu, Wei Fu, Jiaxuan Gao, Wenjie Ye, Weilin Liu, Zhiyu Mei, Guangju Wang, Chao Yu, Yi Wu
ICLR 2024 SRL: Scaling Distributed Reinforcement Learning to over Ten Thousand Cores Zhiyu Mei, Wei Fu, Jiaxuan Gao, Guangju Wang, Huanchen Zhang, Yi Wu
ICMLW 2023 SRL: Scaling Distributed Reinforcement Learning to over Ten Thousand Cores Zhiyu Mei, Wei Fu, Guangju Wang, Huanchen Zhang, Yi Wu