Zhou, Shuyan

12 publications

ICLR 2025 Aligned LLMs Are Not Aligned Browser Agents Priyanshu Kumar, Elaine Lau, Saranya Vijayakumar, Tu Trinh, Elaine T Chang, Vaughn Robinson, Shuyan Zhou, Matt Fredrikson, Sean M. Hendryx, Summer Yue, Zifan Wang
NeurIPS 2025 TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks Frank F. Xu, Yufan Song, Boxuan Li, Yuxuan Tang, Kritanjali Jain, Mengxue Bao, Zora Zhiruo Wang, Xuhui Zhou, Zhitong Guo, Murong Cao, Mingyang Yang, Hao Yang Lu, Amaad Martin, Zhe Su, Leander Melroy Maben, Raj Mehta, Wayne Chi, Lawrence Keunho Jang, Yiqing Xie, Shuyan Zhou, Graham Neubig
NeurIPS 2024 OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, Yitao Liu, Yiheng Xu, Shuyan Zhou, Silvio Savarese, Caiming Xiong, Victor Zhong, Tao Yu
NeurIPS 2024 Synatra: Turning Indirect Knowledge into Direct Demonstrations for Digital Agents at Scale Tianyue Ou, Frank F. Xu, Aman Madaan, Jiarui Liu, Robert Lo, Abishek Sridhar, Sudipta Sengupta, Dan Roth, Graham Neubig, Shuyan Zhou
ICLRW 2024 VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks Jing Yu Koh, Robert Lo, Lawrence Jang, Vikram Duvvur, Ming Chong Lim, Po-Yu Huang, Graham Neubig, Shuyan Zhou, Ruslan Salakhutdinov, Daniel Fried
ICLR 2024 WebArena: A Realistic Web Environment for Building Autonomous Agents Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig
ICMLW 2024 WebCanvas: Benchmarking Web Agents in Online Environments Yichen Pan, Dehan Kong, Sida Zhou, Cheng Cui, Yifei Leng, Bing Jiang, Hangyu Liu, Yanyi Shang, Shuyan Zhou, Tongshuang Wu, Zhengyang Wu
ICMLW 2024 WebCanvas: Benchmarking Web Agents in Online Environments Yichen Pan, Dehan Kong, Sida Zhou, Cheng Cui, Yifei Leng, Bing Jiang, Hangyu Liu, Yanyi Shang, Shuyan Zhou, Tongshuang Wu, Zhengyang Wu
ICLR 2023 DocPrompting: Generating Code by Retrieving the Docs Shuyan Zhou, Uri Alon, Frank F. Xu, Zhengbao Jiang, Graham Neubig
ICML 2023 PAL: Program-Aided Language Models Luyu Gao, Aman Madaan, Shuyan Zhou, Uri Alon, Pengfei Liu, Yiming Yang, Jamie Callan, Graham Neubig
NeurIPSW 2023 WebArena: A Realistic Web Environment for Building Autonomous Agents Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig
NeurIPSW 2023 WebArena: A Realistic Web Environment for Building Autonomous Agents Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig