ML Anthology
Yao, Chaorui
1 publication
ICLR 2026
Group-Relative REINFORCE Is Secretly an Off-Policy Algorithm: Demystifying Some Myths About GRPO and Its Friends
Chaorui Yao, Yanxi Chen, Yuchang Sun, Yushuo Chen, Wenhao Zhang, Xuchen Pan, Yaliang Li, Bolin Ding