Russell, Stuart

135 publications

ICML 2025 AssistanceZero: Scalably Solving Assistance Games Cassidy Laidlaw, Eli Bronstein, Timothy Guo, Dylan Feng, Lukas Berglund, Justin Svegliato, Stuart Russell, Anca Dragan
ICML 2025 Avoiding Catastrophe in Online Learning by Asking for Help Benjamin Plaut, Hanlin Zhu, Stuart Russell
ICLR 2025 BAMDP Shaping: A Unified Framework for Intrinsic Motivation and Reward Shaping Aly Lidayan, Michael D Dennis, Stuart Russell
ICLR 2025 Diffusion on Syntax Trees for Program Synthesis Shreyas Kapur, Erik Jenner, Stuart Russell
ICML 2025 Extractive Structures Learned in Pretraining Enable Generalization on Finetuned Facts Jiahai Feng, Stuart Russell, Jacob Steinhardt
NeurIPS 2025 Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers Yixiao Huang, Hanlin Zhu, Tianyu Guo, Jiantao Jiao, Somayeh Sojoudi, Michael I. Jordan, Stuart Russell, Song Mei
ICLR 2025 Monitoring Latent World States in Language Models with Propositional Probes Jiahai Feng, Stuart Russell, Jacob Steinhardt
ICML 2025 Observation Interference in Partially Observable Assistance Games Scott Emmons, Caspar Oesterheld, Vincent Conitzer, Stuart Russell
UAI 2025 RL, but Don’t Do Anything I Wouldn’t Do Michael K. Cohen, Marcus Hutter, Yoshua Bengio, Stuart Russell
NeurIPS 2025 Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought Hanlin Zhu, Shibo Hao, Zhiting Hu, Jiantao Jiao, Stuart Russell, Yuandong Tian
NeurIPS 2025 Robust and Diverse Multi-Agent Learning via Rational Policy Gradient Niklas Lauffer, Ameesh Shah, Micah Carroll, Sanjit A. Seshia, Stuart Russell, Michael D Dennis
ICLRW 2025 Scalably Solving Assistance Games Cassidy Laidlaw, Eli Bronstein, Timothy Guo, Dylan Feng, Lukas Berglund, Justin Svegliato, Stuart Russell, Anca Dragan
AAAI 2025 The Partially Observable Off-Switch Game Andrew Garber, Rohan Subramani, Linus Luu, Mark Bedaywi, Stuart Russell, Scott Emmons
ICML 2024 AI Alignment with Changing and Influenceable Reward Functions Micah Carroll, Davis Foote, Anand Siththaranjan, Stuart Russell, Anca Dragan
ICLRW 2024 AI Alignment with Changing and Influenceable Reward Functions Micah Carroll, Davis Foote, Anand Siththaranjan, Stuart Russell, Anca Dragan
ICMLW 2024 AI Alignment with Changing and Influenceable Reward Functions Micah Carroll, Davis Foote, Anand Siththaranjan, Stuart Russell, Anca Dragan
ICMLW 2024 AI Alignment with Changing and Influenceable Reward Functions Micah Carroll, Davis Foote, Anand Siththaranjan, Stuart Russell, Anca Dragan
ICMLW 2024 AssistanceZero: Scalably Solving Assistance Games Cassidy Laidlaw, Eli Bronstein, Timothy Guo, Dylan Feng, Lukas Berglund, Justin Svegliato, Stuart Russell, Anca Dragan
NeurIPSW 2024 Diffusion on Syntax Trees for Program Synthesis Shreyas Kapur, Erik Jenner, Stuart Russell
NeurIPS 2024 Evidence of Learned Look-Ahead in a Chess-Playing Neural Network Erik Jenner, Shreyas Kapur, Vasil Georgiev, Cameron Allen, Scott Emmons, Stuart Russell
ICML 2024 Image Hijacks: Adversarial Images Can Control Generative Models at Runtime Luke Bailey, Euan Ong, Stuart Russell, Scott Emmons
ICLR 2024 On Representation Complexity of Model-Based and Model-Free Reinforcement Learning Hanlin Zhu, Baihe Huang, Stuart Russell
ICML 2024 Position: Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback Vincent Conitzer, Rachel Freedman, Jobst Heitzig, Wesley H. Holliday, Bob M. Jacobs, Nathan Lambert, Milan Mosse, Eric Pacuit, Stuart Russell, Hailey Schoelkopf, Emanuel Tewolde, William S. Zwicker
NeurIPSW 2024 Predicting Human Decisions with Behavioral Theories and Machine Learning Ori Plonsky, Reut Apel, Eyal Ert, Moshe Tennenholtz, David Bourgin, Joshua Peterson, Daniel Reichman, Thomas L. Griffiths, Stuart Russell, Evan Carter, James F. Cavanagh, Ido Erev
ICMLW 2024 Scalably Solving Assistance Games Cassidy Laidlaw, Eli Bronstein, Timothy Guo, Dylan Feng, Lukas Berglund, Justin Svegliato, Stuart Russell, Anca Dragan
ICLR 2024 Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game Sam Toyer, Olivia Watkins, Ethan Adrian Mendes, Justin Svegliato, Luke Bailey, Tiffany Wang, Isaac Ong, Karim Elmaaroufi, Pieter Abbeel, Trevor Darrell, Alan Ritter, Stuart Russell
ICLR 2024 The Effective Horizon Explains Deep RL Performance in Stochastic Environments Cassidy Laidlaw, Banghua Zhu, Stuart Russell, Anca Dragan
NeurIPS 2024 Towards a Theoretical Understanding of the 'Reversal Curse' via Training Dynamics Hanlin Zhu, Baihe Huang, Shaolun Zhang, Michael Jordan, Jiantao Jiao, Yuandong Tian, Stuart Russell
CoRL 2024 Trajectory Improvement and Reward Learning from Comparative Language Feedback Zhaojing Yang, Miru Jun, Jeremy Tien, Stuart Russell, Anca Dragan, Erdem Biyik
NeurIPS 2024 When Your AIs Deceive You: Challenges of Partial Observability in Reinforcement Learning from Human Feedback Leon Lang, Davis Foote, Stuart Russell, Anca Dragan, Erik Jenner, Scott Emmons
NeurIPSW 2023 A Theoretical Explanation of Deep RL Performance in Stochastic Environments Cassidy Laidlaw, Banghua Zhu, Stuart Russell, Anca Dragan
NeurIPSW 2023 A Theoretical Explanation of Deep RL Performance in Stochastic Environments Cassidy Laidlaw, Banghua Zhu, Stuart Russell, Anca Dragan
ICML 2023 Adversarial Policies Beat Superhuman Go AIs Tony Tong Wang, Adam Gleave, Tom Tseng, Kellin Pelrine, Nora Belrose, Joseph Miller, Michael D Dennis, Yawen Duan, Viktor Pogrebniak, Sergey Levine, Stuart Russell
ICMLW 2023 Bridging RL Theory and Practice with the Effective Horizon Cassidy Laidlaw, Stuart Russell, Anca Dragan
ICML 2023 Invariance in Policy Optimisation and Partial Identifiability in Reward Learning Joar Max Viktor Skalse, Matthew Farrugia-Roberts, Stuart Russell, Alessandro Abate, Adam Gleave
NeurIPSW 2023 Mitigating Generative Agent Social Dilemmas Julian Yocum, Phillip J.K. Christoffersen, Mehul Damani, Justin Svegliato, Dylan Hadfield-Menell, Stuart Russell
ICLR 2023 Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian Paria Rashidinejad, Hanlin Zhu, Kunhe Yang, Stuart Russell, Jiantao Jiao
AISTATS 2023 SMCP3: Sequential Monte Carlo with Probabilistic Program Proposals Alexander K. Lew, George Matheos, Tan Zhi-Xuan, Matin Ghavamizadeh, Nishad Gothoskar, Stuart Russell, Vikash K. Mansinghka
NeurIPSW 2023 Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game Sam Toyer, Olivia Watkins, Ethan Mendes, Justin Svegliato, Luke Bailey, Tiffany Wang, Isaac Ong, Karim Elmaaroufi, Pieter Abbeel, Trevor Darrell, Alan Ritter, Stuart Russell
NeurIPSW 2023 Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game Sam Toyer, Olivia Watkins, Ethan Adrian Mendes, Justin Svegliato, Luke Bailey, Tiffany Wang, Isaac Ong, Karim Elmaaroufi, Pieter Abbeel, Trevor Darrell, Alan Ritter, Stuart Russell
ICML 2023 Who Needs to Know? Minimal Knowledge for Optimal Coordination Niklas Lauffer, Ameesh Shah, Micah Carroll, Michael D Dennis, Stuart Russell
NeurIPSW 2022 Adversarial Policies Beat Professional-Level Go AIs Tony Tong Wang, Adam Gleave, Nora Belrose, Tom Tseng, Michael D Dennis, Yawen Duan, Viktor Pogrebniak, Joseph Miller, Sergey Levine, Stuart Russell
NeurIPSW 2022 Adversarial Policies Beat Professional-Level Go AIs Tony Tong Wang, Adam Gleave, Nora Belrose, Tom Tseng, Michael D Dennis, Yawen Duan, Viktor Pogrebniak, Sergey Levine, Stuart Russell
ICLR 2022 Cross-Domain Imitation Learning via Optimal Transport Arnaud Fickinger, Samuel Cohen, Stuart Russell, Brandon Amos
ICML 2022 Estimating and Penalizing Induced Preference Shifts in Recommender Systems Micah D Carroll, Anca Dragan, Stuart Russell, Dylan Hadfield-Menell
ICML 2022 For Learning in Symmetric Teams, Local Optima Are Global Nash Equilibria Scott Emmons, Caspar Oesterheld, Andrew Critch, Vincent Conitzer, Stuart Russell
ICLRW 2022 Graphical Clusterability and Local Specialization in Deep Neural Networks Stephen Casper, Shlomi Hod, Daniel Filan, Cody Wild, Andrew Critch, Stuart Russell
NeurIPSW 2021 Cross-Domain Imitation Learning via Optimal Transport Arnaud Fickinger, Samuel Cohen, Stuart Russell, Brandon Amos
ICMLW 2021 Explore and Control with Adversarial Surprise Arnaud Fickinger, Natasha Jaques, Samyak Parajuli, Michael Chang, Nicholas Rhinehart, Glen Berseth, Stuart Russell, Sergey Levine
ICLR 2021 Quantifying Differences in Reward Functions Adam Gleave, Michael D Dennis, Shane Legg, Stuart Russell, Jan Leike
ICLR 2020 Adversarial Policies: Attacking Deep Reinforcement Learning Adam Gleave, Michael Dennis, Cody Wild, Neel Kant, Sergey Levine, Stuart Russell
AAAI 2019 Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient Shihui Li, Yi Wu, Xinyue Cui, Honghua Dong, Fei Fang, Stuart Russell
ICML 2018 An Efficient, Generalized Bellman Update for Cooperative Inverse Reinforcement Learning Dhruv Malik, Malayandi Palaniappan, Jaime Fisac, Dylan Hadfield-Menell, Stuart Russell, Anca Dragan
ICML 2018 Discrete-Continuous Mixtures in Probabilistic Programming: Generalized Semantics and Inference Algorithms Yi Wu, Siddharth Srivastava, Nicholas Hay, Simon Du, Stuart Russell
NeurIPS 2018 Learning Plannable Representations with Causal InfoGAN Thanard Kurutach, Aviv Tamar, Ge Yang, Stuart Russell, Pieter Abbeel
NeurIPS 2018 Meta-Learning MCMC Proposals Tongzhou Wang, Yi Wu, Dave Moore, Stuart Russell
NeurIPS 2018 Negotiable Reinforcement Learning for Pareto Optimal Sequential Decision-Making Nishant Desai, Andrew Critch, Stuart Russell
AAAI 2017 A Nearly-Black-Box Online Algorithm for Joint Parameter and State Estimation in Temporal Models Yusuf Bugra Erol, Yi Wu, Lei Li, Stuart Russell
IJCAI 2017 Efficient Reinforcement Learning with Hierarchies of Machines by Leveraging Internal Transitions Aijun Bai, Stuart Russell
NeurIPS 2017 Inverse Reward Design Dylan Hadfield-Menell, Smitha Milli, Pieter Abbeel, Stuart Russell, Anca Dragan
IJCAI 2017 Should Robots Be Obedient? Smitha Milli, Dylan Hadfield-Menell, Anca D. Dragan, Stuart Russell
AISTATS 2017 Signal-Based Bayesian Seismic Monitoring David A. Moore, Stuart Russell
IJCAI 2017 The Off-Switch Game Dylan Hadfield-Menell, Anca D. Dragan, Pieter Abbeel, Stuart Russell
NeurIPS 2016 Cooperative Inverse Reinforcement Learning Dylan Hadfield-Menell, Stuart Russell, Pieter Abbeel, Anca Dragan
IJCAI 2016 Markovian State and Action Abstractions for MDPs via Hierarchical MCTS Aijun Bai, Siddharth Srivastava, Stuart Russell
AAAI 2016 Metaphysics of Planning Domain Descriptions Siddharth Srivastava, Stuart Russell, Alessandro Pinto
IJCAI 2016 Swift: Compiled Inference for Probabilistic Programming Languages Yi Wu, Lei Li, Stuart Russell, Rastislav Bodík
UAI 2015 A Smart-Dumb/Dumb-Smart Algorithm for Efficient Split-Merge MCMC Wei Wang, Stuart Russell
NeurIPS 2015 Gaussian Process Random Fields David Moore, Stuart Russell
UAI 2015 Multitasking: Optimal Planning for Bandit Superprocesses Dylan Hadfield-Menell, Stuart Russell
AAAI 2015 Tractability of Planning with Loops Siddharth Srivastava, Shlomo Zilberstein, Abhishek Gupta, Pieter Abbeel, Stuart Russell
NeurIPS 2014 Algorithm Selection by Rational Metareasoning as a Model of Human Strategy Selection Falk Lieder, Dillon Plunkett, Jessica B Hamrick, Stuart Russell, Nicholas Hay, Tom Griffiths
UAI 2014 Fast Gaussian Process Posteriors with Product Trees David A. Moore, Stuart Russell
UAI 2014 First-Order Open-Universe POMDPs Siddharth Srivastava, Stuart Russell, Paul Ruan, Xiang Cheng
AISTATS 2013 Dynamic Scaled Sampling for Deterministic Constraints Lei Li, Bharath Ramsundar, Stuart Russell
NeurIPS 2013 Multilinear Dynamical Systems for Tensor Time Series Mark Rogers, Lei Li, Stuart Russell
UAI 2013 Product Trees for Gaussian Process Covariance in Sublinear Time David A. Moore, Stuart Russell
UAI 2012 Selecting Computations: Theory and Applications Nicholas Hay, Stuart Russell, David Tolpin, Solomon Eyal Shimony
UAI 2011 A Temporally Abstracted Viterbi Algorithm Shaunak Chatterjee, Stuart Russell
IJCAI 2011 Bounded Intention Planning Jason Andrew Wolfe, Stuart Russell
AAAI 2011 Global Seismic Monitoring: A Bayesian Approach Nimar S. Arora, Stuart Russell, Paul Kidwell, Erik B. Sudderth
UAI 2010 Gibbs Sampling in Open-Universe Stochastic Languages Nimar S. Arora, Rodrigo de Salvo Braz, Erik B. Sudderth, Stuart Russell
NeurIPS 2010 Global Seismic Monitoring as Probabilistic Inference Nimar Arora, Stuart Russell, Paul Kidwell, Erik B. Sudderth
UAI 2010 RAPID: A Reachable Anytime Planner for Imprecisely-Sensed Domains Emma Brunskill, Stuart Russell
AISTATS 2010 Why Are DBNs Sparse? Shaunak Chatterjee, Stuart Russell
UAI 2008 Improving Gradient Estimation by Incorporating Sensor Data Gregory Lawrence, Stuart Russell
NeurIPS 2008 Probabilistic Detection of Short Events, with Application to Critical Care Monitoring Norm Aleks, Stuart Russell, Michael G. Madden, Diane Morabito, Kristan Staudenmayer, Mitchell Cohen, Geoffrey T. Manley
UAI 2006 A Compact, Hierarchical Q-Function Decomposition Bhaskara Marthi, Stuart Russell, David Andre
UAI 2006 General-Purpose MCMC Inference over Relational Structures Brian Milch, Stuart Russell
AISTATS 2005 Approximate Inference for Infinite Contingent Bayesian Networks Brian Milch, Bhaskara Marthi, David Sontag, Stuart Russell, Daniel L. Ong, Andrey Kolobov
IJCAI 2005 BLOG: Probabilistic Models with Unknown Objects Brian Milch, Bhaskara Marthi, Stuart Russell, David A. Sontag, Daniel L. Ong, Andrey Kolobov
IJCAI 2005 Concurrent Hierarchical Reinforcement Learning Bhaskara Marthi, Stuart Russell, David Latham, Carlos Guestrin
IJCAI 2005 Efficient Belief-State AND-OR Search, with Application to Kriegspiel Stuart Russell, Jason Andrew Wolfe
UAI 2003 A Generalized Mean Field Algorithm for Variational Inference in Exponential Families Eric P. Xing, Michael I. Jordan, Stuart Russell
UAI 2003 Efficient Gradient Estimation for Motor Control Learning Gregory Lawrence, Noah J. Cowan, Stuart Russell
IJCAI 2003 Logical Filtering Eyal Amir, Stuart Russell
ICML 2003 Q-Decomposition for Reinforcement Learning Agents Stuart Russell, Andrew Zimdars
NeurIPS 2002 A Hierarchical Bayesian Markovian Model for Motifs in Biopolymer Sequences Eric P. Xing, Michael I. Jordan, Richard M. Karp, Stuart Russell
UAI 2002 Decayed MCMC Filtering Bhaskara Marthi, Hanna Pasula, Stuart Russell, Yuval Peres
NeurIPS 2002 Distance Metric Learning with Application to Clustering with Side-Information Eric P. Xing, Michael I. Jordan, Stuart Russell, Andrew Y. Ng
NeurIPS 2002 Identity Uncertainty and Citation Matching Hanna Pasula, Bhaskara Marthi, Brian Milch, Stuart Russell, Ilya Shpitser
IJCAI 2001 Approximate Inference for First-Order Probabilistic Languages Hanna Pasula, Stuart Russell
UAI 2001 Variational MCMC Nando de Freitas, Pedro A. d. F. R. Højen-Sørensen, Stuart Russell
ICML 2000 Algorithms for Inverse Reinforcement Learning Andrew Y. Ng, Stuart Russell
UAI 2000 Rao-Blackwellised Particle Filtering for Dynamic Bayesian Networks Arnaud Doucet, Nando de Freitas, Kevin P. Murphy, Stuart Russell
IJCAI 1999 Convergence of Reinforcement Learning with General Function Approximators Vassilis A. Papavassiliou, Stuart Russell
ICML 1999 Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping Andrew Y. Ng, Daishi Harada, Stuart Russell
IJCAI 1999 Tracking Many Objects with Many Sensors Hanna Pasula, Stuart Russell, Michael Ostland, Yaacov Ritov
AAAI 1998 Bayesian Q-Learning Richard Dearden, Nir Friedman, Stuart Russell
COLT 1998 Learning Agents for Uncertain Environments (Extended Abstract) Stuart Russell
MLJ 1998 Learning from Examples and Membership Queries with Structured Determinations Prasad Tadepalli, Stuart Russell
UAI 1998 Learning the Structure of Dynamic Probabilistic Networks Nir Friedman, Kevin P. Murphy, Stuart Russell
AAAI 1998 Speech Recognition with Dynamic Bayesian Networks Geoffrey Zweig, Stuart Russell
MLJ 1997 Adaptive Probabilistic Networks with Hidden Variables John Binder, Daphne Koller, Stuart Russell, Keiji Kanazawa
IJCAI 1997 Challenge: What Is the Impact of Bayesian Networks on Learning? Nir Friedman, Moisés Goldszmidt, David Heckerman, Stuart Russell
UAI 1997 Image Segmentation in Video Sequences: A Probabilistic Approach Nir Friedman, Stuart Russell
IJCAI 1997 Object Identification in a Bayesian Context Timothy Huang, Stuart Russell
IJCAI 1997 Space-Efficient Inference in Dynamic Probabilistic Networks John Binder, Kevin P. Murphy, Stuart Russell
ECML-PKDD 1997 Uncertain Learning Agents (Abstract) Stuart Russell
IJCAI 1995 Approximating Optimal Policies for Partially Observable Stochastic Domains Ronald Parr, Stuart Russell
IJCAI 1995 Local Learning in Probabilistic Networks with Hidden Variables Stuart Russell, John Binder, Daphne Koller, Keiji Kanazawa
ICML 1995 Machine Learning, Proceedings of the Twelfth International Conference on Machine Learning, Tahoe City, California, USA, July 9-12, 1995 Armand Prieditis, Stuart Russell
IJCAI 1995 Rationality and Intelligence Stuart Russell
UAI 1995 Stochastic Simulation Algorithms for Dynamic Probabilistic Networks Keiji Kanazawa, Daphne Koller, Stuart Russell
IJCAI 1995 The BATmobile: Towards a Bayesian Automated Taxi Jeff Forbes, Timothy Huang, Keiji Kanazawa, Stuart Russell
AAAI 1994 Automatic Symbolic Traffic Scene Analysis Using Belief Networks Timothy Huang, Daphne Koller, Jitendra Malik, Gary H. Ogasawara, Bobby S. Rao, Stuart Russell, Joseph Weber
AAAI 1994 Control Strategies for a Stochastic Planner Jonathan Tash, Stuart Russell
IJCAI 1993 Anytime Sensing, Planning and Action: A Practical Model for Robot Control Shlomo Zilberstein, Stuart Russell
ICML 1993 Decision Theoretic Subsampling for Induction on Large Databases Ron Musick, Jason Catlett, Stuart Russell
ECML-PKDD 1993 Learnability of Constrained Logic Programs Saso Dzeroski, Stephen H. Muggleton, Stuart Russell
IJCAI 1993 Planning Using Multiple Execution Architectures Gary H. Ogasawara, Stuart Russell
AAAI 1992 How Long Will It Take? Ron Musick, Stuart Russell
COLT 1992 PAC-Learnability of Determinate Logic Programs Saso Dzeroski, Stephen H. Muggleton, Stuart Russell
IJCAI 1989 On Optimal Game-Tree Search Using Rational Meta-Reasoning Stuart Russell, Eric Wefald
AAAI 1986 Preliminary Steps Toward the Automation of Induction Stuart Russell