Chan, Lawrence

13 publications

NeurIPS 2025. Measuring AI Ability to Complete Long Software Tasks. Thomas Kwa, Ben West, Joel Becker, Amy Deng, Katharyn Garcia, Max Hasin, Sami Jawhar, Megan Kinniment, Nate Rush, Sydney Von Arx, Ryan Bloom, Thomas Broadley, Haoxing Du, Brian Goodrich, Nikola Jurkovic, Luke Harold Miles, Seraphina Nix, Tao Roa Lin, Neev Parikh, David Rein, Lucas Jun Koba Sato, Hjalmar Wijk, Daniel M Ziegler, Elizabeth Barnes, Lawrence Chan
ICML 2025. RE-Bench: Evaluating Frontier AI R&D Capabilities of Language Model Agents Against Human Experts. Hjalmar Wijk, Tao Roa Lin, Joel Becker, Sami Jawhar, Neev Parikh, Thomas Broadley, Lawrence Chan, Michael Chen, Joshua M Clymer, Jai Dhyani, Elena Ericheva, Katharyn Garcia, Brian Goodrich, Nikola Jurkovic, Megan Kinniment, Aron Lajko, Seraphina Nix, Lucas Jun Koba Sato, William Saunders, Maksym Taran, Ben West, Elizabeth Barnes
NeurIPS 2024. Compact Proofs of Model Performance via Mechanistic Interpretability. Jason Gross, Rajashree Agrawal, Thomas Kwa, Euan Ong, Chun Hei Yip, Alex Gibson, Soufiane Noubir, Lawrence Chan
ICMLW 2024. Compact Proofs of Model Performance via Mechanistic Interpretability. Jason Gross, Rajashree Agrawal, Thomas Kwa, Euan Ong, Chun Hei Yip, Alex Gibson, Soufiane Noubir, Lawrence Chan
TMLR 2024. Language Models Are Better than Humans at Next-Token Prediction. Buck Shlegeris, Fabien Roger, Lawrence Chan, Euan McLean
ICMLW 2024. Mathematical Models of Computation in Superposition. Kaarel Hänni, Jake Mendel, Dmitry Vaintrob, Lawrence Chan
ICLR 2024. The Alignment Problem from a Deep Learning Perspective. Richard Ngo, Lawrence Chan, Sören Mindermann
ICML 2023. A Toy Model of Universality: Reverse Engineering How Networks Learn Group Operations. Bilal Chughtai, Lawrence Chan, Neel Nanda
ICLRW 2023. Neural Networks Learn Representation Theory: Reverse Engineering How Networks Perform Group Operations. Bilal Chughtai, Lawrence Chan, Neel Nanda
ICLR 2023. Progress Measures for Grokking via Mechanistic Interpretability. Neel Nanda, Lawrence Chan, Tom Lieberum, Jess Smith, Jacob Steinhardt
NeurIPS 2022. Adversarial Training for High-Stakes Reliability. Daniel Ziegler, Seraphina Nix, Lawrence Chan, Tim Bauman, Peter Schmidt-Nielsen, Tao Lin, Adam Scherlis, Noa Nabeshima, Benjamin Weinstein-Raun, Daniel de Haas, Buck Shlegeris, Nate Thomas
L4DC 2021. Optimal Cost Design for Model Predictive Control. Avik Jain, Lawrence Chan, Daniel S. Brown, Anca D. Dragan
NeurIPSW 2020. Accounting for Human Learning When Inferring Human Preferences. Harry Giles, Lawrence Chan