ML Anthology
Shao, Yakun Sophia
1 publication
NeurIPS
2024
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
Coleman Hooper, Sehoon Kim, Hiva Mohammadzadeh, Michael W. Mahoney, Yakun Sophia Shao, Kurt Keutzer, Amir Gholami