Zaharia, Matei

38 publications

DMLR 2025 Data Acquisition: A New Frontier in Data-Centric AI Lingjiao Chen, Bilge Acun, Newsha Ardalani, Yifan Sun, Feiyang Kang, Hanrui Lyu, Yongchan Kwon, Ruoxi Jia, Carole-Jean Wu, Matei Zaharia, James Zou
ICLR 2025 ElasticTok: Adaptive Tokenization for Image and Video Wilson Yan, Volodymyr Mnih, Aleksandra Faust, Matei Zaharia, Pieter Abbeel, Hao Liu
NeurIPS 2025 Establishing Best Practices in Building Rigorous Agentic Benchmarks Yuxuan Zhu, Tengjun Jin, Yada Pruksachatkun, Andy K Zhang, Shu Liu, Sasha Cui, Sayash Kapoor, Shayne Longpre, Kevin Meng, Rebecca Weiss, Fazl Barez, Rahul Gupta, Jwala Dhamala, Jacob Merizian, Mario Giulianelli, Harry Coppock, Cozmin Ududec, Antony Kellermann, Jasjeet S Sekhon, Jacob Steinhardt, Sarah Schwettmann, Arvind Narayanan, Matei Zaharia, Ion Stoica, Percy Liang, Daniel Kang
ICML 2025 HashAttention: Semantic Sparsity for Faster Inference Aditya Desai, Shuo Yang, Alejandro Cuadron, Matei Zaharia, Joseph E. Gonzalez, Ion Stoica
ICLRW 2025 Learning Automata from Demonstrations, Examples, and Natural Language Marcell Vazquez-Chanlatte, Karim Elmaaroufi, Stefan Witwicki, Matei Zaharia, Sanjit A. Seshia
NeurIPS 2025 Why Do Multi-Agent LLM Systems Fail? Mert Cemri, Melissa Z Pan, Shuyi Yang, Lakshya A Agrawal, Bhavya Chopra, Rishabh Tiwari, Kurt Keutzer, Aditya Parameswaran, Dan Klein, Kannan Ramchandran, Matei Zaharia, Joseph E. Gonzalez, Ion Stoica
ICLRW 2025 Why Do Multiagent Systems Fail? Melissa Z Pan, Mert Cemri, Lakshya A Agrawal, Shuyi Yang, Bhavya Chopra, Rishabh Tiwari, Kurt Keutzer, Aditya Parameswaran, Kannan Ramchandran, Dan Klein, Joseph E. Gonzalez, Matei Zaharia, Ion Stoica
ICLR 2025 World Model on Million-Length Video and Language with Blockwise RingAttention Hao Liu, Wilson Yan, Matei Zaharia, Pieter Abbeel
NeurIPS 2024 Are More LLM Calls All You Need? Towards the Scaling Properties of Compound AI Systems Lingjiao Chen, Jared Davis, Boris Hanin, Peter Bailis, Ion Stoica, Matei Zaharia, James Zou
ICLR 2024 DSPy: Compiling Declarative Language Model Calls into State-of-the-Art Pipelines Omar Khattab, Arnav Singhvi, Paridhi Maheshwari, Zhiyuan Zhang, Keshav Santhanam, Sri Vardhamanan A, Saiful Haq, Ashutosh Sharma, Thomas T. Joshi, Hanna Moazam, Heather Miller, Matei Zaharia, Christopher Potts
TMLR 2024 FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance Lingjiao Chen, Matei Zaharia, James Zou
NeurIPSW 2024 Long Context RAG Performance of Large Language Models Quinn Leng, Jacob Portes, Sam Havens, Matei Zaharia, Michael Carbin
ICLR 2024 RingAttention with Blockwise Transformers for Near-Infinite Context Hao Liu, Matei Zaharia, Pieter Abbeel
NeurIPSW 2023 Analyzing ChatGPT’s Behavior Shifts over Time Lingjiao Chen, Matei Zaharia, James Zou
NeurIPSW 2023 DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines Omar Khattab, Arnav Singhvi, Paridhi Maheshwari, Zhiyuan Zhang, Keshav Santhanam, Sri Vardhamanan A, Saiful Haq, Ashutosh Sharma, Thomas T. Joshi, Hanna Moazam, Heather Miller, Matei Zaharia, Christopher Potts
ICMLW 2023 Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks Daniel Kang, Xuechen Li, Ion Stoica, Carlos Guestrin, Matei Zaharia, Tatsunori Hashimoto
NeurIPSW 2023 Exploration with Principles for Diverse AI Supervision Hao Liu, Matei Zaharia, Pieter Abbeel
AAAI 2023 HAPI Explorer: Comprehension, Discovery, and Explanation on History of ML APIs Lingjiao Chen, Zhihua Jin, Sabri Eyuboglu, Huamin Qu, Christopher Ré, Matei Zaharia, James Zou
ICMLW 2023 Implementing Block-Sparse Matrix Multiplication Kernels Using Triton Priya Mishra, Trevor Gale, Matei Zaharia, Cliff Young, Deepak Narayanan
ICMLW 2023 Less Is More: Using Multiple LLMs for Applications with Lower Costs Lingjiao Chen, Matei Zaharia, James Zou
NeurIPSW 2023 Ring Attention with Blockwise Transformers for Near-Infinite Context Hao Liu, Matei Zaharia, Pieter Abbeel
ICML 2022 Efficient Online ML API Selection for Multi-Label Classification Tasks Lingjiao Chen, Matei Zaharia, James Zou
NeurIPS 2022 Estimating and Explaining Model Performance When Both Covariates and Labels Shift Lingjiao Chen, Matei Zaharia, James Y Zou
NeurIPS 2022 HAPI: A Large-Scale Longitudinal Dataset of Commercial ML API Predictions Lingjiao Chen, Zhihua Jin, Evan Sabri Eyuboglu, Christopher Ré, Matei Zaharia, James Y Zou
ICLR 2022 Hindsight: Posterior-Guided Training of Retrievers for Improved Open-Ended Generation Ashwin Paranjape, Omar Khattab, Christopher Potts, Matei Zaharia, Christopher D Manning
ICLR 2022 How Did the Model Change? Efficiently Assessing Machine Learning API Shifts Lingjiao Chen, Matei Zaharia, James Zou
NeurIPSW 2022 Is Unsupervised Performance Estimation Impossible When Both Covariates and Labels Shift? Lingjiao Chen, Matei Zaharia, James Y. Zou
AAAI 2022 Similarity Search for Efficient Active Learning and Search of Rare Concepts Cody Coleman, Edward Chou, Julian Katz-Samuels, Sean Culatana, Peter Bailis, Alexander C. Berg, Robert D. Nowak, Roshan Sumbaly, Matei Zaharia, I. Zeki Yalniz
NeurIPS 2021 Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval Omar Khattab, Christopher Potts, Matei Zaharia
ICML 2021 Memory-Efficient Pipeline-Parallel DNN Training Deepak Narayanan, Amar Phanishayee, Kaiyu Shi, Xie Chen, Matei Zaharia
NeurIPS 2020 FrugalML: How to Use ML Prediction APIs More Accurately and Cheaply Lingjiao Chen, Matei Zaharia, James Y Zou
ICLR 2020 Selection via Proxy: Efficient Data Selection for Deep Learning Cody Coleman, Christopher Yeh, Stephen Mussmann, Baharan Mirzasoleiman, Peter Bailis, Percy Liang, Jure Leskovec, Matei Zaharia
ICML 2019 LIT: Learned Intermediate Representation Training for Model Compression Animesh Koratana, Daniel Kang, Peter Bailis, Matei Zaharia
MLOSS 2016 MLlib: Machine Learning in Apache Spark Xiangrui Meng, Joseph Bradley, Burak Yavuz, Evan Sparks, Shivaram Venkataraman, Davies Liu, Jeremy Freeman, DB Tsai, Manish Amde, Sean Owen, Doris Xin, Reynold Xin, Michael J. Franklin, Reza Zadeh, Matei Zaharia, Ameet Talwalkar
NeurIPS 2016 Yggdrasil: An Optimized System for Training Deep Decision Trees at Scale Firas Abuzaid, Joseph K. Bradley, Feynman T Liang, Andrew Feng, Lee Yang, Matei Zaharia, Ameet S Talwalkar