ML Anthology
Zhao, Heyang
11 publications
ICLR
2025
Beyond-Expert Performance with Limited Demonstrations: Efficient Imitation Learning with Double Exploration
Heyang Zhao, Xingrui Yu, David Mark Bossens, Ivor Tsang, Quanquan Gu
ICML
2025
Logarithmic Regret for Online KL-Regularized Reinforcement Learning
Heyang Zhao, Chenlu Ye, Wei Xiong, Quanquan Gu, Tong Zhang
NeurIPS
2025
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao, Chenlu Ye, Quanquan Gu, Tong Zhang
NeurIPS
2024
A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning with General Function Approximation
Heyang Zhao, Jiafan He, Quanquan Gu
ICML
2024
Feel-Good Thompson Sampling for Contextual Dueling Bandits
Xuheng Li, Heyang Zhao, Quanquan Gu
ICLR
2024
Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning
Qiwei Di, Heyang Zhao, Jiafan He, Quanquan Gu
NeurIPSW
2024
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao, Chenlu Ye, Quanquan Gu, Tong Zhang
ICLR
2024
Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits
Qiwei Di, Tao Jin, Yue Wu, Heyang Zhao, Farzad Farnoud, Quanquan Gu
ICML
2023
Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes
Jiafan He, Heyang Zhao, Dongruo Zhou, Quanquan Gu
ICML
2023
Optimal Online Generalized Linear Regression with Stochastic Noise and Its Application to Heteroscedastic Bandits
Heyang Zhao, Dongruo Zhou, Jiafan He, Quanquan Gu
COLT
2023
Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency
Heyang Zhao, Jiafan He, Dongruo Zhou, Tong Zhang, Quanquan Gu