Ji, Kaixuan

9 publications

ICLR 2026. Best-of-Majority: Minimax-Optimal Strategy for Pass@k Inference Scaling. Qiwei Di, Kaixuan Ji, Xuheng Li, Heyang Zhao, Quanquan Gu
ICLR 2026. Towards a Sharp Analysis of Offline Policy Learning for $f$-Divergence-Regularized Contextual Bandits. Qingyue Zhao, Kaixuan Ji, Heyang Zhao, Tong Zhang, Quanquan Gu
TMLR 2025. Reinforcement Learning from Human Feedback with Active Queries. Kaixuan Ji, Jiafan He, Quanquan Gu
ICLR 2025. Self-Play Preference Optimization for Language Model Alignment. Yue Wu, Zhiqing Sun, Huizhuo Yuan, Kaixuan Ji, Yiming Yang, Quanquan Gu
ICLR 2024. Horizon-Free Reinforcement Learning in Adversarial Linear Mixture MDPs. Kaixuan Ji, Qingyue Zhao, Jiafan He, Weitong Zhang, Quanquan Gu
ICML 2024. Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models. Zixiang Chen, Yihe Deng, Huizhuo Yuan, Kaixuan Ji, Quanquan Gu
NeurIPS 2024. Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation. Huizhuo Yuan, Zixiang Chen, Kaixuan Ji, Quanquan Gu
ICMLW 2024. Self-Play Preference Optimization for Language Model Alignment. Yue Wu, Zhiqing Sun, Huizhuo Yuan, Kaixuan Ji, Yiming Yang, Quanquan Gu
NeurIPSW 2024. Self-Play Preference Optimization for Language Model Alignment. Yue Wu, Zhiqing Sun, Huizhuo Yuan, Kaixuan Ji, Yiming Yang, Quanquan Gu