Zhao, Heyang

11 publications

ICLR 2025. Beyond-Expert Performance with Limited Demonstrations: Efficient Imitation Learning with Double Exploration. Heyang Zhao, Xingrui Yu, David Mark Bossens, Ivor Tsang, Quanquan Gu.
ICML 2025. Logarithmic Regret for Online KL-Regularized Reinforcement Learning. Heyang Zhao, Chenlu Ye, Wei Xiong, Quanquan Gu, Tong Zhang.
NeurIPS 2025. Sharp Analysis for KL-Regularized Contextual Bandits and RLHF. Heyang Zhao, Chenlu Ye, Quanquan Gu, Tong Zhang.
NeurIPS 2024. A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning with General Function Approximation. Heyang Zhao, Jiafan He, Quanquan Gu.
ICML 2024. Feel-Good Thompson Sampling for Contextual Dueling Bandits. Xuheng Li, Heyang Zhao, Quanquan Gu.
ICLR 2024. Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning. Qiwei Di, Heyang Zhao, Jiafan He, Quanquan Gu.
NeurIPS 2024 Workshop. Sharp Analysis for KL-Regularized Contextual Bandits and RLHF. Heyang Zhao, Chenlu Ye, Quanquan Gu, Tong Zhang.
ICLR 2024. Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits. Qiwei Di, Tao Jin, Yue Wu, Heyang Zhao, Farzad Farnoud, Quanquan Gu.
ICML 2023. Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes. Jiafan He, Heyang Zhao, Dongruo Zhou, Quanquan Gu.
ICML 2023. Optimal Online Generalized Linear Regression with Stochastic Noise and Its Application to Heteroscedastic Bandits. Heyang Zhao, Dongruo Zhou, Jiafan He, Quanquan Gu.
COLT 2023. Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency. Heyang Zhao, Jiafan He, Dongruo Zhou, Tong Zhang, Quanquan Gu.