Li, Zonghang

2 publications

ICLR 2026. Prima.cpp: Fast 30-70b LLM Inference on Heterogeneous and Low-Resource Home Clusters. Zonghang Li, Tao Li, Wenjiao Feng, Rongxing Xiao, Jianshu She, Hong Huang, Mohsen Guizani, Hongfang Yu, Qirong Ho, Wei Xiang, Xue Liu.
ICLR 2026. Tequila: Trapping-Free Ternary Quantization for Large Language Models. Hong Huang, Decheng Wu, Rui Cen, Guanghua Yu, Zonghang Li, Kai Liu, Jianchen Zhu, Peng Chen, Xue Liu, Dapeng Wu.