Gao, Shiping

2 publications

ICLR 2025. Advantage-Guided Distillation for Preference Alignment in Small Language Models. Shiping Gao, Fanqi Wan, Jiajian Guo, Xiaojun Quan, Qifan Wang.
ICML 2025. Discriminative Policy Optimization for Token-Level Reward Models. Hongzhan Chen, Tao Yang, Shiping Gao, Ruijun Chen, Xiaojun Quan, Hongtao Tian, Ting Yao.