Yao, Hengshuai

13 publications

AAAI 2023 The Sufficiency of Off-Policyness and Soft Clipping: PPO Is Still Insufficient According to an Off-Policy Measure Xing Chen, Dongcui Diao, Hechang Chen, Hengshuai Yao, Haiyin Piao, Zhixiao Sun, Zhiwei Yang, Randy Goebel, Bei Jiang, Yi Chang
UAI 2022 Understanding and Mitigating the Limitations of Prioritized Experience Replay Yangchen Pan, Jincheng Mei, Amir-massoud Farahmand, Martha White, Hengshuai Yao, Mohsen Rohani, Jun Luo
ICML 2021 Breaking the Deadly Triad with a Target Network Shangtong Zhang, Hengshuai Yao, Shimon Whiteson
ICML 2020 Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation Shangtong Zhang, Bo Liu, Hengshuai Yao, Shimon Whiteson
IJCAI 2020 Weakly Supervised Few-Shot Object Segmentation Using Co-Attention with Visual and Semantic Embeddings Mennatullah Siam, Naren Doraiswamy, Boris N. Oreshkin, Hengshuai Yao, Martin Jägersand
AAAI 2019 ACE: An Actor Ensemble Algorithm for Continuous Control with Tree Search Shangtong Zhang, Hengshuai Yao
ICML 2019 Distributional Reinforcement Learning for Efficient Exploration Borislav Mavrin, Hengshuai Yao, Linglong Kong, Kaiwen Wu, Yaoliang Yu
IJCAI 2019 Hill Climbing on Value Estimates for Search-Control in Dyna Yangchen Pan, Hengshuai Yao, Amir-massoud Farahmand, Martha White
AAAI 2019 QUOTA: The Quantile Option Architecture for Reinforcement Learning Shangtong Zhang, Hengshuai Yao
NeurIPS 2014 Universal Option Models Hengshuai Yao, Csaba Szepesvári, Richard S. Sutton, Joseph Modayil, Shalabh Bhatnagar
AAAI 2012 Approximate Policy Iteration with Linear Action Models Hengshuai Yao, Csaba Szepesvári
NeurIPS 2009 Multi-Step Dyna Planning for Policy Evaluation and Control Hengshuai Yao, Shalabh Bhatnagar, Dongcui Diao, Richard S. Sutton, Csaba Szepesvári
ICML 2008 Preconditioned Temporal Difference Learning Hengshuai Yao, Zhi-Qiang Liu