He, Zifan

2 publications

AAAI 2025. Dynamic-Width Speculative Beam Decoding for LLM Inference. Zongyue Qin, Zifan He, Neha Prakriya, Jason Cong, Yizhou Sun
ICLR 2025. Optimized Multi-Token Joint Decoding with Auxiliary Model for LLM Inference. Zongyue Qin, Ziniu Hu, Zifan He, Neha Prakriya, Jason Cong, Yizhou Sun