ML Anthology
Qu, Linping
1 publication
ICML 2025
KVTuner: Sensitivity-Aware Layer-Wise Mixed-Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference
Xing Li, Zeyu Xing, Yiming Li, Linping Qu, Hui-Ling Zhen, Yiwu Yao, Wulong Liu, Sinno Jialin Pan, Mingxuan Yuan