ML Anthology
Zheng, Size
3 publications
ICML 2025
MxMoE: Mixed-Precision Quantization for MoE with Accuracy and Performance Co-Design
Haojie Duanmu, Xiuhong Li, Zhihang Yuan, Size Zheng, Jiangfei Duan, Xingcheng Zhang, Dahua Lin
ICML 2025
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
Hanshi Sun, Li-Wen Chang, Wenlei Bao, Size Zheng, Ningxin Zheng, Xin Liu, Harry Dong, Yuejie Chi, Beidi Chen
NeurIPS 2024
ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction
Renze Chen, Zhuofeng Wang, Beiquan Cao, Tong Wu, Size Zheng, Xiuhong Li, Xuechao Wei, Shengen Yan, Meng Li, Yun Liang