Wan, Fanqi

6 publications

ICLR 2026. SPELL: Self-Play Reinforcement Learning for Evolving Long-Context Language Models. Ziyi Yang, Weizhou Shen, Chenliang Li, Ruijun Chen, Fanqi Wan, Ming Yan, Xiaojun Quan, Fei Huang
ICLR 2026. SoLoPO: Unlocking Long-Context Capabilities in LLMs via Short-to-Long Preference Optimization. Huashan Sun, Shengyi Liao, Yansen Han, Yu Bai, Yang Gao, Cheng Fu, Weizhou Shen, Fanqi Wan, Ming Yan, Ji Zhang, Fei Huang
ICLR 2025. Advantage-Guided Distillation for Preference Alignment in Small Language Models. Shiping Gao, Fanqi Wan, Jiajian Guo, Xiaojun Quan, Qifan Wang
AAAI 2025. Empowering Self-Learning of LLMs: Inner Knowledge Explicitation as a Catalyst. Shijue Huang, Wanjun Zhong, Deng Cai, Fanqi Wan, Chengyi Wang, Mingxuan Wang, Mu Qiao, Ruifeng Xu
ICLR 2025. Weighted-Reward Preference Optimization for Implicit Model Fusion. Ziyi Yang, Fanqi Wan, Longguang Zhong, Tianyuan Shi, Xiaojun Quan
ICLR 2024. Knowledge Fusion of Large Language Models. Fanqi Wan, Xinting Huang, Deng Cai, Xiaojun Quan, Wei Bi, Shuming Shi