Van Roy, Benjamin

56 publications

FnTML 2025 Continual Learning as Computationally Constrained Reinforcement Learning Saurabh Kumar, Henrik Marklund, Ashish Rao, Yifan Zhu, Hong Jun Jeon, Yueyang Liu, Benjamin Van Roy
ICML 2024 An Information-Theoretic Analysis of In-Context Learning Hong Jun Jeon, Jason D. Lee, Qi Lei, Benjamin Van Roy
ICML 2024 Efficient Exploration for LLMs Vikranth Dwaracherla, Seyed Mohammad Asghari, Botao Hao, Benjamin Van Roy
NeurIPSW 2024 Information-Theoretic Foundations for Neural Scaling Laws Hong Jun Jeon, Benjamin Van Roy
CoLLAs 2024 Maintaining Plasticity in Continual Learning via Regenerative Regularization Saurabh Kumar, Henrik Marklund, Benjamin Van Roy
ICMLW 2024 RLHF and IIA: Perverse Incentives Wanqiao Xu, Shi Dong, Xiuyuan Lu, Grace Lam, Zheng Wen, Benjamin Van Roy
NeurIPS 2023 A Definition of Continual Reinforcement Learning David Abel, Andre Barreto, Benjamin Van Roy, Doina Precup, Hado P van Hasselt, Satinder P. Singh
UAI 2023 Approximate Thompson Sampling via Epistemic Neural Networks Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy
TMLR 2023 Ensembles for Uncertainty Estimation: Benefits of Prior Functions and Bootstrapping Vikranth Dwaracherla, Zheng Wen, Ian Osband, Xiuyuan Lu, Seyed Mohammad Asghari, Benjamin Van Roy
NeurIPS 2023 Epistemic Neural Networks Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy
ICML 2023 Leveraging Demonstrations to Improve Online Learning: Quality Matters Botao Hao, Rahul Jain, Tor Lattimore, Benjamin Van Roy, Zheng Wen
AISTATS 2023 Nonstationary Bandit Learning via Predictive Sampling Yueyang Liu, Benjamin Van Roy, Kuang Xu
FnTML 2023 Reinforcement Learning, Bit by Bit Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen
NeurIPS 2022 An Analysis of Ensemble Sampling Chao Qin, Zheng Wen, Xiuyuan Lu, Benjamin Van Roy
NeurIPS 2022 An Information-Theoretic Framework for Deep Learning Hong Jun Jeon, Benjamin Van Roy
NeurIPS 2022 Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning Dilip Arumugam, Benjamin Van Roy
ICMLW 2022 Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning Dilip Arumugam, Benjamin Van Roy
UAI 2022 Evaluating High-Order Predictive Distributions in Deep Learning Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Xiuyuan Lu, Benjamin Van Roy
NeurIPSW 2022 On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning Dilip Arumugam, Mark K Ho, Noah Goodman, Benjamin Van Roy
JMLR 2022 Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent States Shi Dong, Benjamin Van Roy, Zhengyuan Zhou
NeurIPS 2022 The Neural Testbed: Evaluating Joint Predictions Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Xiuyuan Lu, Morteza Ibrahimi, Dieterich Lawson, Botao Hao, Brendan O'Donoghue, Benjamin Van Roy
ICML 2021 Deciding What to Learn: A Rate-Distortion Approach Dilip Arumugam, Benjamin Van Roy
NeurIPS 2021 The Value of Information When Deciding What to Learn Dilip Arumugam, Benjamin Van Roy
ICLR 2020 Behaviour Suite for Reinforcement Learning Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvari, Satinder Singh, Benjamin Van Roy, Richard Sutton, David Silver, Hado Van Hasselt
ICLR 2020 Hypermodels for Exploration Vikranth Dwaracherla, Xiuyuan Lu, Morteza Ibrahimi, Ian Osband, Zheng Wen, Benjamin Van Roy
NeurIPS 2020 On Efficiency in Hierarchical Reinforcement Learning Zheng Wen, Doina Precup, Morteza Ibrahimi, Andre Barreto, Benjamin Van Roy, Satinder P. Singh
JMLR 2019 Deep Exploration via Randomized Value Functions Ian Osband, Benjamin Van Roy, Daniel J. Russo, Zheng Wen
NeurIPS 2019 Information-Theoretic Confidence Bounds for Reinforcement Learning Xiuyuan Lu, Benjamin Van Roy
COLT 2019 On the Performance of Thompson Sampling on Logistic Bandits Shi Dong, Tengyu Ma, Benjamin Van Roy
FnTML 2018 A Tutorial on Thompson Sampling Daniel Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband, Zheng Wen
NeurIPS 2018 An Information-Theoretic Analysis for Thompson Sampling with Many Actions Shi Dong, Benjamin Van Roy
ICML 2018 Coordinated Exploration in Concurrent Reinforcement Learning Maria Dimakopoulou, Benjamin Van Roy
NeurIPS 2018 Scalable Coordinated Exploration in Concurrent Reinforcement Learning Maria Dimakopoulou, Ian Osband, Benjamin Van Roy
NeurIPS 2017 Conservative Contextual Linear Bandits Abbas Kazerouni, Mohammad Ghavamzadeh, Yasin Abbasi Yadkori, Benjamin Van Roy
NeurIPS 2017 Ensemble Sampling Xiuyuan Lu, Benjamin Van Roy
ICML 2017 Why Is Posterior Sampling Better than Optimism for Reinforcement Learning? Ian Osband, Benjamin Van Roy
JMLR 2016 An Information-Theoretic Analysis of Thompson Sampling Daniel Russo, Benjamin Van Roy
NeurIPS 2016 Deep Exploration via Bootstrapped DQN Ian Osband, Charles Blundell, Alexander Pritzel, Benjamin Van Roy
ICML 2016 Generalization and Exploration via Randomized Value Functions Ian Osband, Benjamin Van Roy, Zheng Wen
NeurIPS 2014 Learning to Optimize via Information-Directed Sampling Daniel Russo, Benjamin Van Roy
NeurIPS 2014 Model-Based Reinforcement Learning and the Eluder Dimension Ian Osband, Benjamin Van Roy
NeurIPS 2014 Near-Optimal Reinforcement Learning in Factored MDPs Ian Osband, Benjamin Van Roy
NeurIPS 2013 (More) Efficient Reinforcement Learning via Posterior Sampling Ian Osband, Daniel Russo, Benjamin Van Roy
NeurIPS 2013 Efficient Exploration and Value Function Generalization in Deterministic Systems Zheng Wen, Benjamin Van Roy
NeurIPS 2013 Eluder Dimension and the Sample Complexity of Optimistic Exploration Daniel Russo, Benjamin Van Roy
MLJ 2013 Learning a Factor Model via Regularized PCA Yi-Hao Kao, Benjamin Van Roy
NeurIPS 2005 Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games Gabriel Y. Weintraub, Lanier Benkard, Benjamin Van Roy
MLJ 2002 On Average Versus Discounted Reward Temporal-Difference Learning John N. Tsitsiklis, Benjamin Van Roy
ICML 2001 A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal Difference Learning David Choi, Benjamin Van Roy
UAI 2001 A Tractable POMDP for Dynamic Sequencing with Applications to Personalized Internet Content Provision Paat Rusmevichientong, Benjamin Van Roy
ICML 2000 Fixed Points of Approximate Value Iteration and Temporal-Difference Learning Daniela Pucci de Farias, Benjamin Van Roy
NeurIPS 1999 An Analysis of Turbo Decoding with Gaussian Densities Paat Rusmevichientong, Benjamin Van Roy
NeurIPS 1996 Analysis of Temporal-Difference Learning with Function Approximation John N. Tsitsiklis, Benjamin Van Roy
NeurIPS 1996 Approximate Solutions to Optimal Stopping Problems John N. Tsitsiklis, Benjamin Van Roy
MLJ 1996 Feature-Based Methods for Large Scale Dynamic Programming John N. Tsitsiklis, Benjamin Van Roy
NeurIPS 1995 Stable Linear Approximations to Dynamic Programming for Stochastic Control Problems with Local Transitions Benjamin Van Roy, John N. Tsitsiklis