Gu, Zhenyu

3 publications

NeurIPSW 2024. MAPLE: Memory-Aware Predict and Load for Efficient LLM Inference. Zhenyu Liu, Zhemin Zhang, Zirui Zhang, Yanyuan Qin, Jiayi Luo, Zhenyu Gu, Liu Liu
AAAI 2021. Distribution Adaptive INT8 Quantization for Training CNNs. Kang Zhao, Sida Huang, Pan Pan, Yinghan Li, Yingya Zhang, Zhenyu Gu, Yinghui Xu
ICML 2020. Boosting Deep Neural Network Efficiency with Dual-Module Inference. Liu Liu, Lei Deng, Zhaodong Chen, Yuke Wang, Shuangchen Li, Jingwei Zhang, Yihua Yang, Zhenyu Gu, Yufei Ding, Yuan Xie