Cao, Xiaoyang

1 publication

ICLR 2026. RE-PO: Robust Enhanced Policy Optimization as a General Framework for LLM Alignment. Xiaoyang Cao, Zelai Xu, Mo Guang, Kaiwen Long, Michiel A. Bakker, Yu Wang, Chao Yu.