Guo, Zhaohan Daniel

12 publications

AISTATS 2025. A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning. Khimya Khetarpal, Zhaohan Daniel Guo, Bernardo Avila Pires, Yunhao Tang, Clare Lyle, Mark Rowland, Nicolas Heess, Diana L Borsa, Arthur Guez, Will Dabney
JMLR 2025. Optimizing Return Distributions with Distributional Dynamic Programming. Bernardo Ávila Pires, Mark Rowland, Diana Borsa, Zhaohan Daniel Guo, Khimya Khetarpal, André Barreto, David Abel, Rémi Munos, Will Dabney
NeurIPSW 2024. A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning. Khimya Khetarpal, Zhaohan Daniel Guo, Bernardo Avila Pires, Yunhao Tang, Clare Lyle, Mark Rowland, Nicolas Heess, Diana L Borsa, Arthur Guez, Will Dabney
ICML 2024. Generalized Preference Optimization: A Unified Approach to Offline Alignment. Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Remi Munos, Mark Rowland, Pierre Harvey Richemond, Michal Valko, Bernardo Avila Pires, Bilal Piot
ICML 2024. Human Alignment of Large Language Models Through Online Preference Optimisation. Daniele Calandriello, Zhaohan Daniel Guo, Remi Munos, Mark Rowland, Yunhao Tang, Bernardo Avila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot
ICML 2024. Nash Learning from Human Feedback. Remi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar, Mark Rowland, Zhaohan Daniel Guo, Yunhao Tang, Matthieu Geist, Thomas Mesnard, Côme Fiegel, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J Mankowitz, Doina Precup, Bilal Piot
ICML 2023. Representations and Exploration for Deep Reinforcement Learning Using Singular Value Decomposition. Yash Chandak, Shantanu Thakoor, Zhaohan Daniel Guo, Yunhao Tang, Remi Munos, Will Dabney, Diana L Borsa
ICML 2023. Understanding Self-Predictive Learning for Reinforcement Learning. Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Avila Pires, Yash Chandak, Remi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko
NeurIPSW 2022. BLaDE: Robust Exploration via Diffusion Models. Bilal Piot, Zhaohan Daniel Guo, Shantanu Thakoor, Mohammad Gheshlaghi Azar
ICML 2020. Agent57: Outperforming the Atari Human Benchmark. Adrià Puigdomènech Badia, Bilal Piot, Steven Kapturowski, Pablo Sprechmann, Alex Vitvitskyi, Zhaohan Daniel Guo, Charles Blundell
ICML 2020. Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning. Zhaohan Daniel Guo, Bernardo Avila Pires, Bilal Piot, Jean-Bastien Grill, Florent Altché, Remi Munos, Mohammad Gheshlaghi Azar
AISTATS 2016. A PAC RL Algorithm for Episodic POMDPs. Zhaohan Daniel Guo, Shayan Doroudi, Emma Brunskill