Wei, Jia
9 publications
ICML
2025
SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-Thread INT4 Quantization
NeurIPS
2025
SageAttention3: Microscaling FP4 Attention for Inference and an Exploration of 8-Bit Training