Zahavy, Tom

35 publications

NeurIPS 2025 Generating Creative Chess Puzzles Xidong Feng, Vivek Veeriah, Marcus Chiam, Michael D Dennis, Federico Barbero, Johan Obando-Ceron, Jiaxin Shi, Satinder Singh, Shaobo Hou, Nenad Tomasev, Tom Zahavy
ICML 2025 Mastering Board Games by External and Internal Planning with Language Models John Schultz, Jakub Adamek, Matej Jusup, Marc Lanctot, Michael Kaisers, Sarah Perrin, Daniel Hennes, Jeremy Shar, Cannada A. Lewis, Anian Ruoss, Tom Zahavy, Petar Veličković, Laurel Prince, Satinder Singh, Eric Malmi, Nenad Tomasev
ICLR 2023 Discovering Evolution Strategies via Meta-Black-Box Optimization Robert Tjarko Lange, Tom Schaul, Yutian Chen, Tom Zahavy, Valentin Dalibard, Chris Lu, Satinder Singh, Sebastian Flennerhag
ICLR 2023 Discovering Policies with DOMiNO: Diversity Optimization Maintaining near Optimality Tom Zahavy, Yannick Schroecker, Feryal Behbahani, Kate Baumli, Sebastian Flennerhag, Shaobo Hou, Satinder Singh
NeurIPS 2023 Optimistic Meta-Gradients Sebastian Flennerhag, Tom Zahavy, Brendan O'Donoghue, Hado P van Hasselt, András György, Satinder P. Singh
TMLR 2023 POMRL: No-Regret Learning-to-Plan with Increasing Horizons Khimya Khetarpal, Claire Vernade, Brendan O'Donoghue, Satinder Singh, Tom Zahavy
NeurIPSW 2023 POMRL: No-Regret Learning-to-Plan with Increasing Horizons Khimya Khetarpal, Claire Vernade, Brendan O'Donoghue, Satinder Singh, Tom Zahavy
ICML 2023 ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs Ted Moskovitz, Brendan O'Donoghue, Vivek Veeriah, Sebastian Flennerhag, Satinder Singh, Tom Zahavy
ICLR 2022 Bootstrapped Meta-Learning Sebastian Flennerhag, Yannick Schroecker, Tom Zahavy, Hado van Hasselt, David Silver, Satinder Singh
CoLLAs 2022 Meta-Gradients in Non-Stationary Environments Jelena Luketina, Sebastian Flennerhag, Yannick Schroecker, David Abel, Tom Zahavy, Satinder Singh
ICLRW 2022 Meta-Gradients in Non-Stationary Environments Jelena Luketina, Sebastian Flennerhag, Yannick Schroecker, David Abel, Tom Zahavy, Satinder Singh
AAAI 2022 Online Apprenticeship Learning Lior Shani, Tom Zahavy, Shie Mannor
NeurIPSW 2022 Optimistic Meta-Gradients Sebastian Flennerhag, Tom Zahavy, Brendan O'Donoghue, Hado van Hasselt, András György, Satinder Singh
NeurIPS 2022 PaLM up: Playing in the Latent Manifold for Unsupervised Pretraining Hao Liu, Tom Zahavy, Volodymyr Mnih, Satinder P. Singh
ICLR 2021 Balancing Constraints and Rewards with Meta-Gradient D4PG Dan A. Calian, Daniel J Mankowitz, Tom Zahavy, Zhongwen Xu, Junhyuk Oh, Nir Levine, Timothy Mann
NeurIPSW 2021 Bootstrapped Meta-Learning Sebastian Flennerhag, Yannick Schroecker, Tom Zahavy, Hado van Hasselt, David Silver, Satinder Singh
ICMLW 2021 Discovering Diverse Nearly Optimal Policies with Successor Features Tom Zahavy, Brendan O'Donoghue, Andre Barreto, Sebastian Flennerhag, Volodymyr Mnih, Satinder Singh
ICLR 2021 Discovering a Set of Policies for the Worst Case Reward Tom Zahavy, Andre Barreto, Daniel J Mankowitz, Shaobo Hou, Brendan O'Donoghue, Iurii Kemaev, Satinder Singh
NeurIPS 2021 Discovery of Options via Meta-Learned Subgoals Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado P van Hasselt, David Silver, Satinder P. Singh
ICML 2021 Emphatic Algorithms for Deep Reinforcement Learning Ray Jiang, Tom Zahavy, Zhongwen Xu, Adam White, Matteo Hessel, Charles Blundell, Hado Van Hasselt
MLJ 2021 Inverse Reinforcement Learning in Contextual MDPs Stav Belogolovsky, Philip Korsunsky, Shie Mannor, Chen Tessler, Tom Zahavy
ICML 2021 Online Limited Memory Neural-Linear Bandits with Likelihood Matching Ofir Nabati, Tom Zahavy, Shie Mannor
NeurIPS 2021 Reward Is Enough for Convex MDPs Tom Zahavy, Brendan O'Donoghue, Guillaume Desjardins, Satinder P. Singh
ICMLW 2021 Reward Is Enough for Convex MDPs Tom Zahavy, Brendan O'Donoghue, Guillaume Desjardins, Satinder Singh
NeurIPS 2020 A Self-Tuning Actor-Critic Algorithm Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado P van Hasselt, David Silver, Satinder P. Singh
AAAI 2020 Apprenticeship Learning via Frank-Wolfe Tom Zahavy, Alon Cohen, Haim Kaplan, Yishay Mansour
ICLR 2020 Deep Randomized Least Squares Value Iteration Guy Adam, Tom Zahavy, Oron Anschel, Nahum Shimkin
MLHC 2020 Learning to Ask Medical Questions Using Reinforcement Learning Uri Shaham, Tom Zahavy, Cesar Caraballo, Shiwani Mahajan, Daisy Massey, Harlan Krumholz
ALT 2020 Planning in Hierarchical Reinforcement Learning: Guarantees for Using Local Policies Tom Zahavy, Avinatan Hasidim, Haim Kaplan, Yishay Mansour
UAI 2020 Unknown Mixing Times in Apprenticeship and Reinforcement Learning Tom Zahavy, Alon Cohen, Haim Kaplan, Yishay Mansour
AAAI 2018 Is a Picture Worth a Thousand Words? A Deep Multi-Modal Architecture for Product Classification in E-Commerce Tom Zahavy, Abhinandan Krishnan, Alessandro Magnani, Shie Mannor
NeurIPS 2018 Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning Tom Zahavy, Matan Haroush, Nadav Merlis, Daniel J Mankowitz, Shie Mannor
AAAI 2017 A Deep Hierarchical Approach to Lifelong Learning in Minecraft Chen Tessler, Shahar Givony, Tom Zahavy, Daniel J. Mankowitz, Shie Mannor
NeurIPS 2017 Shallow Updates for Deep Reinforcement Learning Nir Levine, Tom Zahavy, Daniel J Mankowitz, Aviv Tamar, Shie Mannor
ICML 2016 Graying the Black Box: Understanding DQNs Tom Zahavy, Nir Ben-Zrihem, Shie Mannor