Ribar, Luka

3 publications

NeurIPSW 2024 Approximate Top-K for Increased Parallelism Oscar Key, Luka Ribar, Alberto Cattaneo, Luke Hudlass-Galley, Douglas Orr
ICML 2024 SparQ Attention: Bandwidth-Efficient LLM Inference Luka Ribar, Ivan Chelombiev, Luke Hudlass-Galley, Charlie Blake, Carlo Luschi, Douglas Orr
ICLRW 2024 SparQ Attention: Bandwidth-Efficient LLM Inference Luka Ribar, Ivan Chelombiev, Luke Hudlass-Galley, Charlie Blake, Carlo Luschi, Douglas Orr