ML Anthology
Klimovic, Ana
3 publications
ICLRW 2025
DeltaMoE: Memory-Efficient Inference for Merged Mixture of Experts with Delta Compression
Boyko Borisov, Xiaozhe Yao, Nezihe Merve Gürel, Ana Klimovic
ICML 2025
Demystifying Cost-Efficiency in LLM Serving over Heterogeneous GPUs
Youhe Jiang, Fangcheng Fu, Xiaozhe Yao, Guoliang He, Xupeng Miao, Ana Klimovic, Bin Cui, Binhang Yuan, Eiko Yoneki
ICML 2024
DéjàVu: KV-Cache Streaming for Fast, Fault-Tolerant Generative LLM Serving
Foteini Strati, Sara Mcallister, Amar Phanishayee, Jakub Tarnawski, Ana Klimovic