Qin, Yanyuan

1 publication

NeurIPSW 2024 MAPLE: Memory-Aware Predict and Load for Efficient LLM Inference Zhenyu Liu, Zhemin Zhang, Zirui Zhang, Yanyuan Qin, Jiayi Luo, Zhenyu Gu, Liu Liu