Liu, Xiaojiang
9 publications
ICLR 2025
TIS-DPO: Token-Level Importance Sampling for Direct Preference Optimization with Estimated Weights
Aiwei Liu, Haoping Bai, Zhiyun Lu, Yanchao Sun, Xiang Kong, Xiaoming Simon Wang, Jiulong Shan, Albin Madappally Jose, Xiaojiang Liu, Lijie Wen, Philip S. Yu, Meng Cao