Xiong, Zhihan

11 publications

ICLRW 2025 Hybrid Preference Optimization for Alignment: Provably Faster Convergence Rates by Combining Offline Preferences with Online Exploration Avinandan Bose, Zhihan Xiong, Aadirupa Saha, Simon Shaolei Du, Maryam Fazel
ICLRW 2025 Language Model Preference Evaluation with Multiple Weak Evaluators Zhengyu Hu, Jieyu Zhang, Zhihan Xiong, Alexander Ratner, Hui Xiong, Ranjay Krishna
ICLR 2024 A Black-Box Approach for Non-Stationary Multi-Agent Reinforcement Learning Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon Shaolei Du
AISTATS 2024 A/B Testing and Best-Arm Identification for Linear Bandits with Robustness to Non-Stationarity Zhihan Xiong, Romain Camilleri, Maryam Fazel, Lalit Jain, Kevin Jamieson
ICMLW 2024 Dual Approximation Policy Optimization Zhihan Xiong, Maryam Fazel, Lin Xiao
ICLR 2023 Offline Congestion Games: How Feedback Type Affects Data Coverage Requirement Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon Shaolei Du
ICML 2022 Fourier Learning with Cyclical Data Yingxiang Yang, Zhihan Xiong, Tianyi Liu, Taiqing Wang, Chong Wang
NeurIPS 2022 Learning in Congestion Games with Bandit Feedback Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du
NeurIPS 2022 Near-Optimal Randomized Exploration for Tabular Markov Decision Processes Zhihan Xiong, Ruoqi Shen, Qiwen Cui, Maryam Fazel, Simon S. Du
NeurIPS 2021 Selective Sampling for Online Best-Arm Identification Romain Camilleri, Zhihan Xiong, Maryam Fazel, Lalit Jain, Kevin G. Jamieson
AAAI 2020 Parameterized Indexed Value Function for Efficient Exploration in Reinforcement Learning Tian Tan, Zhihan Xiong, Vikranth R. Dwaracherla