Caples, Diego
2 publications
NeurIPS
2025
REAL: Benchmarking Autonomous Agents on Deterministic Simulations of Real Websites
Divyansh Garg, Diego Caples, Andis Draguns, Nikil Ravi, Pranav Putta, Naman Garg, Prannay Hebbar, Youngchul Joo, Jindong Gu, Charles London, Christian Schroeder de Witt, Sumeet Ramesh Motwani
NeurIPS
2025
Thinking vs. Doing: Improving Agent Reasoning by Scaling Test-Time Interaction
Junhong Shen, Hao Bai, Lunjun Zhang, Yifei Zhou, Amrith Setlur, Shengbang Tong, Diego Caples, Nan Jiang, Tong Zhang, Ameet Talwalkar, Aviral Kumar