ML Anthology
Dong, Harry
4 publications
ICML
2025
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
Hanshi Sun, Li-Wen Chang, Wenlei Bao, Size Zheng, Ningxin Zheng, Xin Liu, Harry Dong, Yuejie Chi, Beidi Chen
ICML
2024
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
Harry Dong, Xinyu Yang, Zhenyu Zhang, Zhangyang Wang, Yuejie Chi, Beidi Chen
ICMLW
2024
Prompt-Prompted Adaptive Structured Pruning for Efficient LLM Generation
Harry Dong, Beidi Chen, Yuejie Chi
ICMLW
2023
Towards Structured Sparsity in Transformers for Efficient Inference
Harry Dong, Beidi Chen, Yuejie Chi