ML Anthology
Chen, Kaiqi
2 publications
ICLR 2025
DeFT: Decoding with Flash Tree-Attention for Efficient Tree-Structured LLM Inference
Jinwei Yao, Kaiqi Chen, Kexun Zhang, Jiaxuan You, Binhang Yuan, Zeke Wang, Tao Lin
ICLRW 2024
DeFT: Flash Tree-Attention with IO-Awareness for Efficient Tree-Search-Based LLM Inference
Jinwei Yao, Kexun Zhang, Kaiqi Chen, Jiaxuan You, Zeke Wang, Binhang Yuan, Tao Lin