Motwani, Sumeet Ramesh

7 publications

ICLRW 2025. MALT: Improving Reasoning with Multi-Agent LLM Training. Sumeet Ramesh Motwani, Chandler Smith, Rocktim Jyoti Das, Rafael Rafailov, Ivan Laptev, Philip Torr, Fabio Pizzati, Ronald Clark, Christian Schroeder de Witt
NeurIPS 2025. REAL: Benchmarking Autonomous Agents on Deterministic Simulations of Real Websites. Divyansh Garg, Diego Caples, Andis Draguns, Nikil Ravi, Pranav Putta, Naman Garg, Prannay Hebbar, Youngchul Joo, Jindong Gu, Charles London, Christian Schroeder de Witt, Sumeet Ramesh Motwani
TMLR 2024. Foundational Challenges in Assuring Alignment and Safety of Large Language Models. Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric J Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Chenyu Zhang, Ruiqi Zhong, Sean O hEigeartaigh, Gabriel Recchia, Giulio Corsi, Alan Chan, Markus Anderljung, Lilian Edwards, Aleksandar Petrov, Christian Schroeder de Witt, Sumeet Ramesh Motwani, Yoshua Bengio, Danqi Chen, Philip Torr, Samuel Albanie, Tegan Maharaj, Jakob Nicolaus Foerster, Florian Tramèr, He He, Atoosa Kasirzadeh, Yejin Choi, David Krueger
ICLR 2024. STARC: A General Framework for Quantifying Differences Between Reward Functions. Joar Max Viktor Skalse, Lucy Farnik, Sumeet Ramesh Motwani, Erik Jenner, Adam Gleave, Alessandro Abate
NeurIPS 2024. Secret Collusion Among AI Agents: Multi-Agent Deception via Steganography. Sumeet Ramesh Motwani, Mikhail Baranchuk, Martin Strohmeier, Vijay Bolina, Philip H.S. Torr, Lewis Hammond, Christian Schroeder de Witt
NeurIPS 2024. Unelicitable Backdoors via Cryptographic Transformer Circuits. Andis Draguns, Andrew Gritsevskiy, Sumeet Ramesh Motwani, Christian Schroeder de Witt
NeurIPSW 2023. A Perfect Collusion Benchmark: How Can AI Agents Be Prevented from Colluding with Information-Theoretic Undetectability? Sumeet Ramesh Motwani, Mikhail Baranchuk, Lewis Hammond, Christian Schroeder de Witt