ML Anthology
Brandon, William
2 publications
ICML 2025
Ladder-Residual: Parallelism-Aware Architecture for Accelerating Large Model Inference with Communication Overlapping
Muru Zhang, Mayank Mishra, Zhongzhu Zhou, William Brandon, Jue Wang, Yoon Kim, Jonathan Ragan-Kelley, Shuaiwen Leon Song, Ben Athiwaratkun, Tri Dao
NeurIPS 2024
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
William Brandon, Mayank Mishra, Aniruddha Nrusimha, Rameswar Panda, Jonathan Ragan-Kelley