Wang, Haoxu

2 publications

ICLR 2026. SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention. Jintao Zhang, Haoxu Wang, Kai Jiang, Shuo Yang, Kaiwen Zheng, Haocheng Xi, Ziteng Wang, Hongzhou Zhu, Min Zhao, Ion Stoica, Joseph E. Gonzalez, Jianfei Chen, Jun Zhu
NeurIPS 2025. SageAttention3: Microscaling FP4 Attention for Inference and an Exploration of 8-Bit Training. Jintao Zhang, Jia Wei, Haoxu Wang, Pengle Zhang, Xiaoming Xu, Haofeng Huang, Kai Jiang, Jianfei Chen, Jun Zhu