Wu, Lulu

1 publication

ICLR 2026
LaSeR: Reinforcement Learning with Last-Token Self-Rewarding
Wenkai Yang, Weijie Liu, Ruobing Xie, Yiju Guo, Lulu Wu, Saiyong Yang, Yankai Lin