Zuo, Pengfei

2 publications

ICLR 2026. DualMap: Enabling Both Cache Affinity and Load Balancing for Distributed LLM Serving. Ying Yuan, Pengfei Zuo, Bo Wang, Zhangyu Chen, Zhipeng Tan, Zhou Yu
AAAI 2025. AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference. Zhuomin He, Yizhen Yao, Pengfei Zuo, Bin Gao, Qinya Li, Zhenzhe Zheng, Fan Wu