Qi, Zhao

1 publication

TMLR 2025. QPO: Query-Dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning. Yilun Kong, Hangyu Mao, Zhao Qi, Bin Zhang, Jingqing Ruan, Li Shen, Yongzhe Chang, Xueqian Wang, Rui Zhao, Dacheng Tao.