ML Anthology
Yao, Yiwu
5 publications
NeurIPS 2025
DartQuant: Efficient Rotational Distribution Calibration for LLM Quantization
Yuantian Shao, Yuanteng Chen, Peisong Wang, Jianlin Yu, Jing Lin, Yiwu Yao, Zhihui Wei, Jian Cheng
ICLR 2025
Dynamic Low-Rank Sparse Adaptation for Large Language Models
Weizhong Huang, Yuxin Zhang, Xiawu Zheng, Liuyang, Jing Lin, Yiwu Yao, Rongrong Ji
ICML 2025
KVTuner: Sensitivity-Aware Layer-Wise Mixed-Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference
Xing Li, Zeyu Xing, Yiming Li, Linping Qu, Hui-Ling Zhen, Yiwu Yao, Wulong Liu, Sinno Jialin Pan, Mingxuan Yuan
ICLR 2025
RazorAttention: Efficient KV Cache Compression Through Retrieval Heads
Hanlin Tang, Yang Lin, Jing Lin, Qingsen Han, Danning Ke, Shikuan Hong, Yiwu Yao, Gongyi Wang
ICLR 2024
Dynamic Sparse No Training: Training-Free Fine-Tuning for Sparse LLMs
Yuxin Zhang, Lirui Zhao, Mingbao Lin, Sun Yunyun, Yiwu Yao, Xingjia Han, Jared Tanner, Shiwei Liu, Rongrong Ji