Lv, Zhonghou

1 publication

IJCAI 2025. Indirect Online Preference Optimization via Reinforcement Learning. En Wang, Xingyu Lin, Du Su, Chenfu Bao, Zhonghou Lv, Funing Yang, Yuanbo Xu, Wenbin Liu