Jia, Zhihao

14 publications

ICLR 2025. MagicPIG: LSH Sampling for Efficient LLM Generation. Zhuoming Chen, Ranajoy Sadhukhan, Zihao Ye, Yang Zhou, Jianyu Zhang, Niklas Nolte, Yuandong Tian, Matthijs Douze, Leon Bottou, Zhihao Jia, Beidi Chen
NeurIPS 2025. SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning. Rui Pan, Yinwei Dai, Zhihao Zhang, Gabriele Oliaro, Zhihao Jia, Ravi Netravali
NeurIPS 2025. SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications. Gabriele Oliaro, Zhihao Jia, Daniel F. Campos, Aurick Qiao
ICLR 2025. TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention. Lijie Yang, Zhihao Zhang, Zhuofu Chen, Zikun Li, Zhihao Jia
ICML 2024. Accelerating Iterative Retrieval-Augmented Language Model Serving with Speculation. Zhihao Zhang, Alan Zhu, Lijie Yang, Yihua Xu, Lanting Li, Phitchaya Mangpo Phothilimthana, Zhihao Jia
NeurIPSW 2024. CAT Pruning: Cluster-Aware Token Pruning for Text-to-Image Diffusion Models. Xinle Cheng, Zhuoming Chen, Zhihao Jia
NeurIPS 2024. Communication Bounds for the Distributed Experts Problem. Zhihao Jia, Qi Pang, Trung Tran, David Woodruff, Zhihao Zhang, Wenting Zheng
NeurIPSW 2024. MagicPIG: LSH Sampling for Efficient LLM Generation. Zhuoming Chen, Ranajoy Sadhukhan, Zihao Ye, Yang Zhou, Jianyu Zhang, Niklas Nolte, Yuandong Tian, Matthijs Douze, Leon Bottou, Zhihao Jia, Beidi Chen
NeurIPS 2024. Sequoia: Scalable and Robust Speculative Decoding. Zhuoming Chen, Avner May, Ruslan Svirschevski, Yuhsun Huang, Max Ryabinin, Zhihao Jia, Beidi Chen
NeurIPS 2024. SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices. Ruslan Svirschevski, Avner May, Zhuoming Chen, Beidi Chen, Zhihao Jia, Max Ryabinin
IJCAI 2024. X-Former Elucidator: Reviving Efficient Attention for Long Context Language Modeling. Xupeng Miao, Shenhan Zhu, Fangcheng Fu, Ziyu Guo, Zhi Yang, Yaofeng Tu, Zhihao Jia, Bin Cui
NeurIPS 2022. BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs. Kay Liu, Yingtong Dou, Yue Zhao, Xueying Ding, Xiyang Hu, Ruitong Zhang, Kaize Ding, Canyu Chen, Hao Peng, Kai Shu, Lichao Sun, Jundong Li, George H. Chen, Zhihao Jia, Philip S. Yu
ICLR 2022. GradSign: Model Performance Inference with Theoretical Insights. Zhihao Zhang, Zhihao Jia
ICML 2018. Exploring Hidden Dimensions in Accelerating Convolutional Neural Networks. Zhihao Jia, Sina Lin, Charles R. Qi, Alex Aiken