Peng, Runyu

1 publication

NeurIPS 2025
Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections
Bo Wang, Qinyuan Cheng, Runyu Peng, Rong Bao, Peiji Li, Qipeng Guo, Linyang Li, Zhiyuan Zeng, Yunhua Zhou, Xipeng Qiu