Zhou, Enyu
6 publications
AAAI
2025
Alleviating Shifted Distribution in Human Preference Alignment Through Meta-Learning
Shihan Dou, Yan Liu, Enyu Zhou, Songyang Gao, Tianlong Li, Limao Xiong, Xin Zhao, Haoxiang Jia, Junjie Ye, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
NeurIPS
2025
Pre-Trained Policy Discriminators Are General Reward Models
Shihan Dou, Shichun Liu, Yuming Yang, Yicheng Zou, Yunhua Zhou, Shuhao Xing, Chenhao Huang, Qiming Ge, Haijun Lv, Demin Song, Songyang Gao, Chengqi Lyu, Enyu Zhou, Honglin Guo, Zhiheng Xi, Qipeng Guo, Wenwei Zhang, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Kai Chen
ICLR
2025
RMB: Comprehensively Benchmarking Reward Models in LLM Alignment
Enyu Zhou, Guodong Zheng, Binghai Wang, Zhiheng Xi, Shihan Dou, Rong Bao, Wei Shen, Limao Xiong, Jessica Fan, Yurong Mou, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang