Tucker, George

38 publications

ICLR 2025 Training Language Models to Self-Correct via Reinforcement Learning Aviral Kumar, Vincent Zhuang, Rishabh Agarwal, Yi Su, John D Co-Reyes, Avi Singh, Kate Baumli, Shariq Iqbal, Colton Bishop, Rebecca Roelofs, Lei M Zhang, Kay McKinney, Disha Shrivastava, Cosmin Paduraru, George Tucker, Doina Precup, Feryal Behbahani, Aleksandra Faust
ICMLW 2023 Guided Evolution with Binary Predictors for ML Program Search John D Co-Reyes, Yingjie Miao, George Tucker, Aleksandra Faust, Esteban Real
ICLR 2023 Offline Q-Learning on Diverse Multi-Task Data Both Scales and Generalizes Aviral Kumar, Rishabh Agarwal, Xinyang Geng, George Tucker, Sergey Levine
NeurIPSW 2023 Scaling Offline Q-Learning with Vision Transformers Yingjie Miao, Jordi Orbay, Rishabh Agarwal, Aviral Kumar, George Tucker, Aleksandra Faust
NeurIPS 2023 Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research Cole Gulino, Justin Fu, Wenjie Luo, George Tucker, Eli Bronstein, Yiren Lu, Jean Harb, Xinlei Pan, Yan Wang, Xiangyu Chen, John D Co-Reyes, Rishabh Agarwal, Rebecca Roelofs, Yao Lu, Nico Montali, Paul Mougin, Zoey Yang, Brandyn White, Aleksandra Faust, Rowan McAllister, Dragomir Anguelov, Benjamin Sapp
AISTATS 2022 Offline Policy Selection Under Uncertainty Mengjiao Yang, Bo Dai, Ofir Nachum, George Tucker, Dale Schuurmans
ICLR 2022 DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization Aviral Kumar, Rishabh Agarwal, Tengyu Ma, Aaron Courville, George Tucker, Sergey Levine
ICML 2022 Model Selection in Batch Policy Optimization Jonathan Lee, George Tucker, Ofir Nachum, Bo Dai
NeurIPSW 2022 Offline Q-Learning on Diverse Multi-Task Data Both Scales and Generalizes Aviral Kumar, Rishabh Agarwal, Xinyang Geng, George Tucker, Sergey Levine
NeurIPS 2022 Oracle Inequalities for Model Selection in Offline Reinforcement Learning Jonathan N Lee, George Tucker, Ofir Nachum, Bo Dai, Emma Brunskill
ICLR 2021 Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization Michael R Zhang, Thomas Paine, Ofir Nachum, Cosmin Paduraru, George Tucker, Ziyu Wang, Mohammad Norouzi
ICLR 2021 Benchmarks for Deep Off-Policy Evaluation Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Thomas Paine
NeurIPS 2021 Coupled Gradient Estimators for Discrete Latent Variables Zhe Dong, Andriy Mnih, George Tucker
NeurIPSW 2021 DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization Aviral Kumar, Rishabh Agarwal, Tengyu Ma, Aaron Courville, George Tucker, Sergey Levine
NeurIPSW 2021 Offline Policy Selection Under Uncertainty Mengjiao Yang, Bo Dai, Ofir Nachum, George Tucker, Dale Schuurmans
NeurIPS 2020 Conservative Q-Learning for Offline Reinforcement Learning Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine
NeurIPS 2020 DisARM: An Antithetic Gradient Estimator for Binary Latent Variables Zhe Dong, Andriy Mnih, George Tucker
ICLR 2020 Meta-Learning Without Memorization Mingzhang Yin, George Tucker, Mingyuan Zhou, Sergey Levine, Chelsea Finn
ICLR 2020 Model Based Reinforcement Learning for Atari Łukasz Kaiser, Mohammad Babaeizadeh, Piotr Miłoś, Błażej Osiński, Roy H Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski
NeurIPS 2019 Don't Blame the ELBO! a Linear VAE Perspective on Posterior Collapse James Lucas, George Tucker, Roger B Grosse, Mohammad Norouzi
ICLR 2019 Doubly Reparameterized Gradient Estimators for Monte Carlo Objectives George Tucker, Dieterich Lawson, Shixiang Gu, Chris J. Maddison
NeurIPS 2019 Energy-Inspired Models: Learning with Sampler-Induced Distributions John Lawson, George Tucker, Bo Dai, Rajesh Ranganath
ICML 2019 Guided Evolutionary Strategies: Augmenting Random Search with Surrogate Gradients Niru Maheswaranathan, Luke Metz, George Tucker, Dami Choi, Jascha Sohl-Dickstein
ICML 2019 On Variational Bounds of Mutual Information Ben Poole, Sherjil Ozair, Aäron van den Oord, Alex Alemi, George Tucker
ICLRW 2019 Revisiting Auxiliary Latent Variables in Generative Models Dieterich Lawson, George Tucker, Bo Dai, Rajesh Ranganath
NeurIPS 2019 Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction Aviral Kumar, Justin Fu, Matthew Soh, George Tucker, Sergey Levine
ICLR 2019 The Laplacian in RL: Learning Representations with Efficient Approximations Yifan Wu, George Tucker, Ofir Nachum
ICLRW 2019 Understanding Posterior Collapse in Generative Latent Variable Models James Lucas, George Tucker, Roger Grosse, Mohammad Norouzi
ICLR 2018 Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling Carlos Riquelme, George Tucker, Jasper Snoek
NeurIPS 2018 Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion Jacob Buckman, Danijar Hafner, George Tucker, Eugene Brevdo, Honglak Lee
ICML 2018 Smoothed Action Value Functions for Learning Gaussian Policies Ofir Nachum, Mohammad Norouzi, George Tucker, Dale Schuurmans
ICML 2018 The Mirage of Action-Dependent Baselines in Reinforcement Learning George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard Turner, Zoubin Ghahramani, Sergey Levine
NeurIPS 2017 Filtering Variational Objectives Chris J. Maddison, John Lawson, George Tucker, Nicolas Heess, Mohammad Norouzi, Andriy Mnih, Arnaud Doucet, Yee Whye Teh
ICLR 2017 Particle Value Functions Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Arnaud Doucet, Andriy Mnih, Yee Whye Teh
NeurIPS 2017 REBAR: Low-Variance, Unbiased Gradient Estimates for Discrete Latent Variable Models George Tucker, Andriy Mnih, Chris J. Maddison, John Lawson, Jascha Sohl-Dickstein
ICLRW 2017 REBAR: Low-Variance, Unbiased Gradient Estimates for Discrete Latent Variable Models George Tucker, Andriy Mnih, Chris J. Maddison, Jascha Sohl-Dickstein
ICLR 2017 Regularizing Neural Networks by Penalizing Confident Output Distributions Gabriel Pereyra, George Tucker, Jan Chorowski, Łukasz Kaiser, Geoffrey E. Hinton