Wan, Haiyuan

1 publication

NeurIPS 2025
Spotlight Attention: Towards Efficient LLM Generation via Non-Linear Hashing-Based KV Cache Retrieval
Wenhao Li, Yuxin Zhang, Gen Luo, Haiyuan Wan, ZiYang Gong, Fei Chao, Rongrong Ji