ML Anthology
Yang, Saiyong
2 publications
ICLR 2026
LaSeR: Reinforcement Learning with Last-Token Self-Rewarding
Wenkai Yang, Weijie Liu, Ruobing Xie, Yiju Guo, Lulu Wu, Saiyong Yang, Yankai Lin
ICLR 2026
Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners
Xin Xu, Clive Bai, Kai Yang, Tianhao Chen, Yang Wang, Saiyong Yang, Can Yang