Gui, Sheng

2 publications

ICLRW 2024 Distributed Inference Performance Optimization for LLMs on CPUs Pujiang He, Shan Zhou, Changqing Li, Wenhuan Huang, Weifei Yu, Duyi Wang, Chen Meng, Sheng Gui
ICMLW 2024 Inference Performance Optimization for Large Language Models on CPUs Pujiang He, Shan Zhou, Wenhuan Huang, Changqing Li, Duyi Wang, Bin Guo, Chen Meng, Sheng Gui, Weifei Yu, Yi Xie