ML Anthology
Zuo, Pengfei
2 publications
ICLR 2026
DualMap: Enabling Both Cache Affinity and Load Balancing for Distributed LLM Serving
Ying Yuan, Pengfei Zuo, Bo Wang, Zhangyu Chen, Zhipeng Tan, Zhou Yu
AAAI 2025
AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference
Zhuomin He, Yizhen Yao, Pengfei Zuo, Bin Gao, Qinya Li, Zhenzhe Zheng, Fan Wu