ML Anthology
RuibinZheng
1 publication
ICLR 2026
GEPO: Group Expectation Policy Optimization for Stable Heterogeneous Reinforcement Learning
Han Zhang, RuibinZheng, Zexuan Yi, Zhuo Zhang, Hanyang Peng, Hui Wang, Jiayin Qi, Binxing Fang, Ruifeng Xu, Yue Yu