Singh, Satinder

121 publications

NeurIPS 2025 Generating Creative Chess Puzzles Xidong Feng, Vivek Veeriah, Marcus Chiam, Michael D Dennis, Federico Barbero, Johan Obando-Ceron, Jiaxin Shi, Satinder Singh, Shaobo Hou, Nenad Tomasev, Tom Zahavy
ICML 2025 Mastering Board Games by External and Internal Planning with Language Models John Schultz, Jakub Adamek, Matej Jusup, Marc Lanctot, Michael Kaisers, Sarah Perrin, Daniel Hennes, Jeremy Shar, Cannada A. Lewis, Anian Ruoss, Tom Zahavy, Petar Veličković, Laurel Prince, Satinder Singh, Eric Malmi, Nenad Tomasev
NeurIPS 2025 Plasticity as the Mirror of Empowerment David Abel, Michael Bowling, Andre Barreto, Will Dabney, Shi Dong, Steven Stenberg Hansen, Anna Harutyunyan, Khimya Khetarpal, Clare Lyle, Razvan Pascanu, Georgios Piliouras, Doina Precup, Jonathan Richens, Mark Rowland, Tom Schaul, Satinder Singh
ICML 2024 Genie: Generative Interactive Environments Jake Bruce, Michael D Dennis, Ashley Edwards, Jack Parker-Holder, Yuge Shi, Edward Hughes, Matthew Lai, Aditi Mavalankar, Richie Steigerwald, Chris Apps, Yusuf Aytar, Sarah Maria Elisabeth Bechtle, Feryal Behbahani, Stephanie C.Y. Chan, Nicolas Heess, Lucy Gonzalez, Simon Osindero, Sherjil Ozair, Scott Reed, Jingwei Zhang, Konrad Zolna, Jeff Clune, Nando De Freitas, Satinder Singh, Tim Rocktäschel
ICLR 2023 Composing Task Knowledge with Modular Successor Feature Approximators Wilka Torrico Carvalho, Angelos Filos, Richard Lewis, Honglak Lee, Satinder Singh
ICLR 2023 Discovering Evolution Strategies via Meta-Black-Box Optimization Robert Tjarko Lange, Tom Schaul, Yutian Chen, Tom Zahavy, Valentin Dalibard, Chris Lu, Satinder Singh, Sebastian Flennerhag
ICLR 2023 Discovering Policies with DOMiNO: Diversity Optimization Maintaining near Optimality Tom Zahavy, Yannick Schroecker, Feryal Behbahani, Kate Baumli, Sebastian Flennerhag, Shaobo Hou, Satinder Singh
ICML 2023 Human-Timescale Adaptation in an Open-Ended Task Space Jakob Bauer, Kate Baumli, Feryal Behbahani, Avishkar Bhoopchand, Nathalie Bradley-Schmieg, Michael Chang, Natalie Clay, Adrian Collister, Vibhavari Dasagi, Lucy Gonzalez, Karol Gregor, Edward Hughes, Sheleem Kashem, Maria Loks-Thompson, Hannah Openshaw, Jack Parker-Holder, Shreya Pathak, Nicolas Perez-Nieves, Nemanja Rakicevic, Tim Rocktäschel, Yannick Schroecker, Satinder Singh, Jakub Sygnowski, Karl Tuyls, Sarah York, Alexander Zacherl, Lei M Zhang
ICLR 2023 In-Context Reinforcement Learning with Algorithm Distillation Michael Laskin, Luyu Wang, Junhyuk Oh, Emilio Parisotto, Stephen Spencer, Richie Steigerwald, Dj Strouse, Steven Stenberg Hansen, Angelos Filos, Ethan Brooks, Maxime Gazeau, Himanshu Sahni, Satinder Singh, Volodymyr Mnih
TMLR 2023 POMRL: No-Regret Learning-to-Plan with Increasing Horizons Khimya Khetarpal, Claire Vernade, Brendan O'Donoghue, Satinder Singh, Tom Zahavy
NeurIPSW 2023 POMRL: No-Regret Learning-to-Plan with Increasing Horizons Khimya Khetarpal, Claire Vernade, Brendan O'Donoghue, Satinder Singh, Tom Zahavy
ICML 2023 ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs Ted Moskovitz, Brendan O'Donoghue, Vivek Veeriah, Sebastian Flennerhag, Satinder Singh, Tom Zahavy
ICMLW 2023 Structured State Space Models for In-Context Reinforcement Learning Chris Lu, Yannick Schroecker, Albert Gu, Emilio Parisotto, Jakob Nicolaus Foerster, Satinder Singh, Feryal Behbahani
NeurIPSW 2023 Vision-Language Models as a Source of Rewards Kate Baumli, Satinder Singh, Feryal Behbahani, Harris Chan, Gheorghe Comanici, Sebastian Flennerhag, Maxime Gazeau, Kristian Holsheimer, Dan Horgan, Michael Laskin, Clare Lyle, Volodymyr Mnih, Alexander Neitz, Fabio Pardo, Jack Parker-Holder, John Quan, Tim Rocktäschel, Himanshu Sahni, Tom Schaul, Yannick Schroecker, Stephen Spencer, Richie Steigerwald, Luyu Wang, Lei M Zhang
AAAI 2022 Adaptive Pairwise Weights for Temporal Credit Assignment Zeyu Zheng, Risto Vuorio, Richard L. Lewis, Satinder Singh
ICLR 2022 Bootstrapped Meta-Learning Sebastian Flennerhag, Yannick Schroecker, Tom Zahavy, Hado van Hasselt, David Silver, Satinder Singh
NeurIPSW 2022 Composing Task Knowledge with Modular Successor Feature Approximators Wilka Torrico Carvalho, Angelos Filos, Richard Lewis, Honglak Lee, Satinder Singh
NeurIPSW 2022 In-Context Policy Iteration Ethan Brooks, Logan A Walls, Richard Lewis, Satinder Singh
NeurIPSW 2022 In-Context Reinforcement Learning with Algorithm Distillation Michael Laskin, Luyu Wang, Junhyuk Oh, Emilio Parisotto, Stephen Spencer, Richie Steigerwald, Dj Strouse, Steven Stenberg Hansen, Angelos Filos, Ethan Brooks, Maxime Gazeau, Himanshu Sahni, Satinder Singh, Volodymyr Mnih
NeurIPSW 2022 In-Context Reinforcement Learning with Algorithm Distillation Michael Laskin, Luyu Wang, Junhyuk Oh, Emilio Parisotto, Stephen Spencer, Richie Steigerwald, Dj Strouse, Steven Stenberg Hansen, Angelos Filos, Ethan Brooks, Maxime Gazeau, Himanshu Sahni, Satinder Singh, Volodymyr Mnih
CoLLAs 2022 Meta-Gradients in Non-Stationary Environments Jelena Luketina, Sebastian Flennerhag, Yannick Schroecker, David Abel, Tom Zahavy, Satinder Singh
ICLRW 2022 Meta-Gradients in Non-Stationary Environments Jelena Luketina, Sebastian Flennerhag, Yannick Schroecker, David Abel, Tom Zahavy, Satinder Singh
IJCAI 2022 On the Expressivity of Markov Reward (Extended Abstract) David Abel, Will Dabney, Anna Harutyunyan, Mark K. Ho, Michael L. Littman, Doina Precup, Satinder Singh
NeurIPSW 2022 Optimistic Meta-Gradients Sebastian Flennerhag, Tom Zahavy, Brendan O'Donoghue, Hado van Hasselt, András György, Satinder Singh
NeurIPSW 2021 Bootstrapped Meta-Learning Sebastian Flennerhag, Yannick Schroecker, Tom Zahavy, Hado van Hasselt, David Silver, Satinder Singh
ICMLW 2021 Discovering Diverse Nearly Optimal Policies with Successor Features Tom Zahavy, Brendan O'Donoghue, Andre Barreto, Sebastian Flennerhag, Volodymyr Mnih, Satinder Singh
ICLR 2021 Discovering a Set of Policies for the Worst Case Reward Tom Zahavy, Andre Barreto, Daniel J Mankowitz, Shaobo Hou, Brendan O'Donoghue, Iurii Kemaev, Satinder Singh
AAAI 2021 Efficient Querying for Cooperative Probabilistic Commitments Qi Zhang, Edmund H. Durfee, Satinder Singh
NeurIPSW 2021 GrASP: Gradient-Based Affordance Selection for Planning Vivek Veeriah, Zeyu Zheng, Richard Lewis, Satinder Singh
IJCAI 2021 Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a First-Person Simulated 3D Environment Wilka Carvalho, Anthony Liang, Kimin Lee, Sungryull Sohn, Honglak Lee, Richard L. Lewis, Satinder Singh
ICML 2021 Reinforcement Learning of Implicit and Explicit Control Flow Instructions Ethan Brooks, Janarthanan Rajendran, Richard L Lewis, Satinder Singh
ICMLW 2021 Reward Is Enough for Convex MDPs Tom Zahavy, Brendan O'Donoghue, Guillaume Desjardins, Satinder Singh
ICLR 2020 Behaviour Suite for Reinforcement Learning Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvári, Satinder Singh, Benjamin Van Roy, Richard Sutton, David Silver, Hado van Hasselt
AAAI 2020 How Should an Agent Practice? Janarthanan Rajendran, Richard L. Lewis, Vivek Veeriah, Honglak Lee, Satinder Singh
AAAI 2020 Modeling Probabilistic Commitments for Maintenance Is Inherently Harder than for Achievement Qi Zhang, Edmund H. Durfee, Satinder Singh
AAAI 2020 Querying to Find a Safe Policy Under Uncertain Safety Constraints in Markov Decision Processes Shun Zhang, Edmund H. Durfee, Satinder Singh
AISTATS 2020 Sample Complexity of Reinforcement Learning Using Linearly Combined Model Ensembles Aditya Modi, Nan Jiang, Ambuj Tewari, Satinder Singh
ICML 2020 What Can Learned Intrinsic Rewards Capture? Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh
NeurIPS 2019 Discovery of Useful Questions as Auxiliary Tasks Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Janarthanan Rajendran, Richard L. Lewis, Junhyuk Oh, Hado P van Hasselt, David Silver, Satinder Singh
NeurIPS 2019 Hindsight Credit Assignment Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Gheshlaghi Azar, Bilal Piot, Nicolas Heess, Hado P van Hasselt, Gregory Wayne, Satinder Singh, Doina Precup, Remi Munos
AAAI 2019 Learning to Communicate and Solve Visual Blocks-World Tasks Qi Zhang, Richard L. Lewis, Satinder Singh, Edmund H. Durfee
NeurIPS 2019 No-Press Diplomacy: Modeling Multi-Agent Gameplay Philip Paquette, Yuchen Lu, Seton Steven Bocco, Max Smith, Satya O.-G., Jonathan K. Kummerfeld, Joelle Pineau, Satinder Singh, Aaron C. Courville
NeurIPS 2018 Completing State Representations Using Spectral Learning Nan Jiang, Alex Kulesza, Satinder Singh
ALT 2018 Markov Decision Processes with Continuous Side Information Aditya Modi, Nan Jiang, Satinder Singh, Ambuj Tewari
IJCAI 2018 Minimax-Regret Querying on Side Effects for Safe Optimality in Factored Markov Decision Processes Shun Zhang, Edmund H. Durfee, Satinder Singh
NeurIPS 2018 On Learning Intrinsic Rewards for Policy Gradient Methods Zeyu Zheng, Junhyuk Oh, Satinder Singh
ICML 2018 Self-Imitation Learning Junhyuk Oh, Yijie Guo, Satinder Singh, Honglak Lee
ICLR 2017 Learning to Query, Reason, and Answer Questions on Ambiguous Texts Xiaoxiao Guo, Tim Klinger, Clemens Rosenbaum, Joseph P. Bigus, Murray Campbell, Ban Kawas, Kartik Talamadupula, Gerry Tesauro, Satinder Singh
AAAI 2017 Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, February 4-9, 2017, San Francisco, California, USA Satinder Singh, Shaul Markovitch
NeurIPS 2017 Repeated Inverse Reinforcement Learning Kareem Amin, Nan Jiang, Satinder Singh
NeurIPS 2017 Value Prediction Network Junhyuk Oh, Satinder Singh, Honglak Lee
ICML 2017 Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning Junhyuk Oh, Satinder Singh, Honglak Lee, Pushmeet Kohli
IJCAI 2016 Commitment Semantics for Sequential Decision Making Under Reward Uncertainty Qi Zhang, Edmund H. Durfee, Satinder Singh, Anna Chen, Stefan J. Witwicki
IJCAI 2016 Deep Learning for Reward Design to Improve Monte Carlo Tree Search in ATARI Games Xiaoxiao Guo, Satinder Singh, Richard L. Lewis, Honglak Lee
UAI 2016 Gradient Methods for Stackelberg Games Kareem Amin, Michael P. Wellman, Satinder Singh
AAAI 2016 Improving Predictive State Representations via Gradient Descent Nan Jiang, Alex Kulesza, Satinder Singh
MLJ 2016 Multi-Task Seizure Detection: Addressing Intra-Patient Variation in Seizure Morphologies Alexander Van Esbroeck, Landon Smith, Zeeshan Syed, Satinder Singh, Zahi N. Karam
IJCAI 2016 On Structural Properties of MDPs That Bound Loss Due to Shallow Planning Nan Jiang, Satinder Singh, Ambuj Tewari
IJCAI 2016 The Dependence of Effective Planning Horizon on Model Accuracy Nan Jiang, Alex Kulesza, Satinder Singh, Richard L. Lewis
ICML 2015 Abstraction Selection in Model-Based Reinforcement Learning Nan Jiang, Alex Kulesza, Satinder Singh
NeurIPS 2015 Action-Conditional Video Prediction Using Deep Networks in Atari Games Junhyuk Oh, Xiaoxiao Guo, Honglak Lee, Richard L. Lewis, Satinder Singh
AISTATS 2015 Low-Rank Spectral Learning with Weighted Loss Functions Alex Kulesza, Nan Jiang, Satinder Singh
AAAI 2015 Spectral Learning of Predictive State Representations with Insufficient Statistics Alex Kulesza, Nan Jiang, Satinder Singh
AISTATS 2014 Characterizing EVOI-Sufficient K-Response Query Sets in Decision Problems Robert Cohn, Satinder Singh, Edmund H. Durfee
NeurIPS 2014 Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning Xiaoxiao Guo, Satinder Singh, Honglak Lee, Richard L. Lewis, Xiaoshi Wang
AAAI 2014 Evaluating Trauma Patients: Addressing Missing Covariates with Joint Optimization Alexander Van Esbroeck, Satinder Singh, Ilan Rubinfeld, Zeeshan Syed
AISTATS 2014 Low-Rank Spectral Learning Alex Kulesza, N. Raj Rao, Satinder Singh
AAAI 2014 Predicting Postoperative Atrial Fibrillation from Independent ECG Components Chih-Chun Chia, James Blum, Zahi N. Karam, Satinder Singh, Zeeshan Syed
NeurIPS 2013 Reward Mapping for Transfer in Long-Lived Agents Xiaoxiao Guo, Satinder Singh, Richard L. Lewis
AAAI 2012 Computing Stackelberg Equilibria in Discounted Stochastic Games Yevgeniy Vorobeychik, Satinder Singh
AAAI 2012 Security Games with Limited Surveillance Bo An, David Kempe, Christopher Kiekintveld, Eric Shieh, Satinder Singh, Milind Tambe, Yevgeniy Vorobeychik
AAAI 2011 Comparing Action-Query Strategies in Semi-Autonomous Agents Robert Cohn, Edmund H. Durfee, Satinder Singh
JAIR 2011 Learning to Make Predictions in Partially Observable Environments Without a Generative Model Erik Talvitie, Satinder Singh
AAAI 2011 Optimal Rewards Versus Leaf-Evaluation Heuristics in Planning Agents Jonathan Sorg, Satinder Singh, Richard L. Lewis
ICML 2010 Internal Rewards Mitigate Agent Boundedness Jonathan Sorg, Satinder Singh, Richard L. Lewis
UAI 2010 Variance-Based Rewards for Approximate Bayesian Reinforcement Learning Jonathan Sorg, Satinder Singh, Richard L. Lewis
IJCAI 2009 Learning Graphical Game Models Quang Duong, Yevgeniy Vorobeychik, Satinder Singh, Michael P. Wellman
IJCAI 2009 Maintaining Predictions over Time Without a Model Erik Talvitie, Satinder Singh
ICML 2008 Efficiently Learning Linear-Linear Exponential Family Predictive Representations of State David Wingate, Satinder Singh
UAI 2008 Knowledge Combination in Graphical Multiagent Models Quang Duong, Michael P. Wellman, Satinder Singh
AAAI 2007 Abstraction in Predictive State Representations Vishal Soni, Satinder Singh
IJCAI 2007 An Experts Algorithm for Transfer Learning Erik Talvitie, Satinder Singh
AAAI 2007 Enabling Domain-Awareness for a Generic Natural Language Interface Yunyao Li, Ishan Chaudhuri, Huahai Yang, Satinder Singh, H. V. Jagadish
MLJ 2007 Learning Payoff Functions in Infinite Games Yevgeniy Vorobeychik, Michael P. Wellman, Satinder Singh
IJCAI 2007 Relational Knowledge with Predictive State Representations David Wingate, Vishal Soni, Britton Wolfe, Satinder Singh
ICML 2006 Kernel Predictive Linear Gaussian Models for Nonlinear Stochastic Dynamical Systems David Wingate, Satinder Singh
AAAI 2006 Mixtures of Predictive Linear Gaussian Models for Nonlinear, Stochastic Dynamical Systems David Wingate, Satinder Singh
UAI 2006 Optimal Coordinated Planning Amongst Self-Interested Agents with Private State Ruggiero Cavallo, David C. Parkes, Satinder Singh
ICML 2006 Predictive Linear-Gaussian Models of Controlled Stochastic Dynamical Systems Matthew R. Rudary, Satinder Singh
ICML 2006 Predictive State Representations with Options Britton Wolfe, Satinder Singh
AAAI 2006 Using Homomorphisms to Transfer Options Across Continuous Reinforcement Learning Domains Vishal Soni, Satinder Singh
IJCAI 2005 Combining Memory and Landmarks with Predictive State Representations Michael R. James, Britton Wolfe, Satinder Singh
IJCAI 2005 Learning Payoff Functions in Infinite Games Yevgeniy Vorobeychik, Michael P. Wellman, Satinder Singh
ICML 2005 Learning Predictive State Representations in Dynamical Systems Without Reset Britton Wolfe, Michael R. James, Satinder Singh
AAAI 2005 Planning in Models That Combine Memory with Predictive Representations of State Michael R. James, Satinder Singh
UAI 2005 Predictive Linear-Gaussian Models of Stochastic Dynamical Systems Matthew R. Rudary, Satinder Singh, David Wingate
ICML 2004 Adaptive Cognitive Orthotics: Combining Reinforcement Learning and Constraint-Based Temporal Reasoning Matthew R. Rudary, Satinder Singh, Martha E. Pollack
ICML 2004 Learning and Discovery of Predictive State Representations in Dynamical Systems with Reset Michael R. James, Satinder Singh
UAI 2004 Predictive State Representations: A New Theory for Modeling Dynamical Systems Satinder Singh, Michael R. James, Matthew R. Rudary
ICML 2003 Learning Predictive State Representations Satinder Singh, Michael L. Littman, Nicholas K. Jong, David Pardoe, Peter Stone
AAAI 2002 CobotDS: A Spoken Dialogue System for Chat Michael J. Kearns, Charles Lee Isbell Jr., Satinder Singh, Diane J. Litman, Jessica Howe
MLJ 2002 Introduction Satinder Singh
MLJ 2002 Near-Optimal Reinforcement Learning in Polynomial Time Michael J. Kearns, Satinder Singh
JAIR 2002 Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System Satinder Singh, Diane J. Litman, Michael J. Kearns, Marilyn A. Walker
JAIR 2001 ATTac-2000: An Adaptive Autonomous Bidding Agent Peter Stone, Michael L. Littman, Satinder Singh, Michael J. Kearns
UAI 2001 Graphical Models for Game Theory Michael J. Kearns, Michael L. Littman, Satinder Singh
ICML 2000 A Boosting Approach to Topic Spotting on Subdialogues Kary L. Myers, Michael J. Kearns, Satinder Singh, Marilyn A. Walker
COLT 2000 Bias-Variance Error Bounds for Temporal Difference Updates Michael J. Kearns, Satinder Singh
AAAI 2000 Cobot in LambdaMOO: A Social Statistics Agent Charles Lee Isbell Jr., Michael J. Kearns, David P. Kormann, Satinder Singh, Peter Stone
MLJ 2000 Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms Satinder Singh, Tommi S. Jaakkola, Michael L. Littman, Csaba Szepesvári
ICML 2000 Eligibility Traces for Off-Policy Policy Evaluation Doina Precup, Richard S. Sutton, Satinder Singh
AAAI 2000 Empirical Evaluation of a Reinforcement Learning Spoken Dialogue System Satinder Singh, Michael J. Kearns, Diane J. Litman, Marilyn A. Walker
UAI 2000 Fast Planning in Stochastic Games Michael J. Kearns, Yishay Mansour, Satinder Singh
UAI 2000 Nash Convergence of Gradient Dynamics in General-Sum Games Satinder Singh, Michael J. Kearns, Yishay Mansour
UAI 1999 Approximate Planning for Factored POMDPs Using Belief State Simplification David A. McAllester, Satinder Singh
UAI 1999 On the Complexity of Policy Iteration Yishay Mansour, Satinder Singh
MLJ 1998 Analytical Mean Squared Error Curves for Temporal Difference Learning Satinder Singh, Peter Dayan
ICML 1998 Intra-Option Learning About Temporally Abstract Actions Richard S. Sutton, Doina Precup, Satinder Singh
ICML 1998 Near-Optimal Reinforcement Learning in Polynomial Time Michael J. Kearns, Satinder Singh
ECML-PKDD 1998 Theoretical Results on Reinforcement Learning with Temporally Abstract Options Doina Precup, Richard S. Sutton, Satinder Singh
ICML 1998 Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes John Loch, Satinder Singh