Fu, Wei

10 publications

NeurIPS 2025 AREAL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning Wei Fu, Jiaxuan Gao, Xujie Shen, Chen Zhu, Zhiyu Mei, Chuyi He, Shusheng Xu, Guo Wei, Jun Mei, Wang Jiashu, Tongkai Yang, Binhang Yuan, Yi Wu
NeurIPS 2025 How Far Are We from Optimal Reasoning Efficiency? Jiaxuan Gao, Shu Yan, Qixin Tan, Lu Yang, Shusheng Xu, Wei Fu, Zhiyu Mei, Kaifeng Lyu, Yi Wu
NeurIPS 2024 Hyper-Opinion Evidential Deep Learning for Out-of-Distribution Detection Jingen Qu, Yufei Chen, Xiaodong Yue, Wei Fu, Qiguang Huang
ICML 2024 Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study Shusheng Xu, Wei Fu, Jiaxuan Gao, Wenjie Ye, Weilin Liu, Zhiyu Mei, Guangju Wang, Chao Yu, Yi Wu
ICLR 2024 SRL: Scaling Distributed Reinforcement Learning to over Ten Thousand Cores Zhiyu Mei, Wei Fu, Jiaxuan Gao, Guangju Wang, Huanchen Zhang, Yi Wu
NeurIPS 2023 Iteratively Learn Diverse Strategies with State Distance Information Wei Fu, Weihua Du, Jingwei Li, Sunli Chen, Jingzhao Zhang, Yi Wu
ICMLW 2023 SRL: Scaling Distributed Reinforcement Learning to over Ten Thousand Cores Zhiyu Mei, Wei Fu, Guangju Wang, Huanchen Zhang, Yi Wu
ICLR 2022 Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization Zihan Zhou, Wei Fu, Bingliang Zhang, Yi Wu
ICML 2022 Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning Wei Fu, Chao Yu, Zelai Xu, Jiaqi Yang, Yi Wu
NeurIPSW 2021 Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization Zihan Zhou, Wei Fu, Bingliang Zhang, Yi Wu