ML Anthology
Dong, Hande
2 publications
ICLR 2026
Scheduling Your LLM Reinforcement Learning with Reasoning Trees
Hong Wang, Zhezheng Hao, Jian Luo, Chenxing Wei, Yao Shu, Lei Liu, Cheaterlin, Hande Dong, Jiawei Chen
NeurIPS 2025
ReDit: Reward Dithering for Improved LLM Policy Optimization
Chenxing Wei, Jiarui Yu, Ying Tiffany He, Hande Dong, Yao Shu, Fei Yu