ML Anthology
Hosmer, Basil
1 publication
ICML 2024
CHAI: Clustered Head Attention for Efficient LLM Inference
Saurabh Agarwal, Bilge Acun, Basil Hosmer, Mostafa Elhoushi, Yejin Lee, Shivaram Venkataraman, Dimitris Papailiopoulos, Carole-Jean Wu