Venkataraman, Shivaram

6 publications

ICML 2025 LV-XAttn: Distributed Cross-Attention for Long Visual Inputs in Multimodal Large Language Models Tzu-Tao Chang, Shivaram Venkataraman
ICML 2025 Scaling Inference-Efficient Language Models Song Bian, Minghao Yan, Shivaram Venkataraman
ICML 2024 CHAI: Clustered Head Attention for Efficient LLM Inference Saurabh Agarwal, Bilge Acun, Basil Hosmer, Mostafa Elhoushi, Yejin Lee, Shivaram Venkataraman, Dimitris Papailiopoulos, Carole-Jean Wu
ICMLW 2024 CO2: Precise Attention Score Observation for Improving KV Cache Replacement in Large Language Model Meguru Yamazaki, Shivaram Venkataraman
ICML 2017 Breaking Locality Accelerates Block Gauss-Seidel Stephen Tu, Shivaram Venkataraman, Ashia C. Wilson, Alex Gittens, Michael I. Jordan, Benjamin Recht
MLOSS 2016 MLlib: Machine Learning in Apache Spark Xiangrui Meng, Joseph Bradley, Burak Yavuz, Evan Sparks, Shivaram Venkataraman, Davies Liu, Jeremy Freeman, DB Tsai, Manish Amde, Sean Owen, Doris Xin, Reynold Xin, Michael J. Franklin, Reza Zadeh, Matei Zaharia, Ameet Talwalkar