ML Anthology
Nrusimha, Aniruddha
1 publication
NeurIPS 2024
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
William Brandon, Mayank Mishra, Aniruddha Nrusimha, Rameswar Panda, Jonathan Ragan-Kelley