Zhang, Zhendong

1 publication

ICLRW 2025 Acceleration Multiple Heads Decoding for LLM via Dynamic Tree Attention Zhendong Zhang