Carroll, Micah

13 publications

ICLRW 2025 CTRL-Rec: Controlling Recommender Systems with Natural Language Micah Carroll, Adeline Foote, Marcus Williams, Anca Dragan, W. Bradley Knox, Smitha Milli
ICLR 2025 On Targeted Manipulation and Deception When Optimizing LLMs for User Feedback Marcus Williams, Micah Carroll, Adhyyan Narang, Constantin Weisser, Brendan Murphy, Anca Dragan
NeurIPS 2025 Robust and Diverse Multi-Agent Learning via Rational Policy Gradient Niklas Lauffer, Ameesh Shah, Micah Carroll, Sanjit A. Seshia, Stuart Russell, Michael D Dennis
ICML 2024 AI Alignment with Changing and Influenceable Reward Functions Micah Carroll, Davis Foote, Anand Siththaranjan, Stuart Russell, Anca Dragan
ICLRW 2024 AI Alignment with Changing and Influenceable Reward Functions Micah Carroll, Davis Foote, Anand Siththaranjan, Stuart Russell, Anca Dragan
ICMLW 2024 AI Alignment with Changing and Influenceable Reward Functions Micah Carroll, Davis Foote, Anand Siththaranjan, Stuart Russell, Anca Dragan
ICMLW 2024 AI Alignment with Changing and Influenceable Reward Functions Micah Carroll, Davis Foote, Anand Siththaranjan, Stuart Russell, Anca Dragan
NeurIPSW 2024 Targeted Manipulation and Deception Emerge in LLMs Trained on User* Feedback Marcus Williams, Micah Carroll, Constantin Weisser, Brendan Murphy, Adhyyan Narang, Anca Dragan
TMLR 2023 Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomek Korbak, David Lindner, Pedro Freire, Tony Tong Wang, Samuel Marks, Charbel-Raphael Segerie, Micah Carroll, Andi Peng, Phillip J.K. Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Biyik, Anca Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell
ICML 2023 Who Needs to Know? Minimal Knowledge for Optimal Coordination Niklas Lauffer, Ameesh Shah, Micah Carroll, Michael D Dennis, Stuart Russell
ICLRW 2022 Towards Flexible Inference in Sequential Decision Problems via Bidirectional Transformers Micah Carroll, Jessy Lin, Orr Paradise, Raluca Georgescu, Mingfei Sun, David Bignell, Stephanie Milani, Katja Hofmann, Matthew Hausknecht, Anca Dragan, Sam Devlin
NeurIPS 2022 Uni[MASK]: Unified Inference in Sequential Decision Problems Micah Carroll, Orr Paradise, Jessy Lin, Raluca Georgescu, Mingfei Sun, David Bignell, Stephanie Milani, Katja Hofmann, Matthew Hausknecht, Anca Dragan, Sam Devlin
NeurIPS 2019 On the Utility of Learning About Humans for Human-AI Coordination Micah Carroll, Rohin Shah, Mark K Ho, Tom Griffiths, Sanjit Seshia, Pieter Abbeel, Anca Dragan