ML Anthology
Bao, Yilin
1 publication
ICLRW 2025
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Huaijie Wang, Shibo Hao, Hanze Dong, Shenao Zhang, Yilin Bao, Ziran Yang, Yi Wu