Guo, Junxian

5 publications

ICLR 2025 DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads Guangxuan Xiao, Jiaming Tang, Jingwei Zuo, Junxian Guo, Shang Yang, Haotian Tang, Yao Fu, Song Han
ICLR 2025 SVDQuant: Absorbing Outliers by Low-Rank Component for 4-Bit Diffusion Models Muyang Li, Yujun Lin, Zhekai Zhang, Tianle Cai, Xiuyu Li, Junxian Guo, Enze Xie, Chenlin Meng, Jun-Yan Zhu, Song Han
ICML 2025 SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity Samir Khaki, Xiuyu Li, Junxian Guo, Ligeng Zhu, Konstantinos N. Plataniotis, Amir Yazdanbakhsh, Kurt Keutzer, Song Han, Zhijian Liu
ICCV 2025 SparseVILA: Decoupling Visual Sparsity for Efficient VLM Inference Samir Khaki, Junxian Guo, Jiaming Tang, Shang Yang, Yukang Chen, Konstantinos N. Plataniotis, Yao Lu, Song Han, Zhijian Liu
ICML 2025 XAttention: Block Sparse Attention with Antidiagonal Scoring Ruyi Xu, Guangxuan Xiao, Haofeng Huang, Junxian Guo, Song Han