ML Anthology
Shahout, Rana
5 publications
ICLR
2025
Don’t Stop Me Now: Embedding Based Scheduling for LLMs
Rana Shahout, Eran Malach, Chunwei Liu, Weifan Jiang, Minlan Yu, Michael Mitzenmacher
NeurIPS
2025
Fast Inference for Augmented Large Language Models
Rana Shahout, Cong Liang, Shiji Xin, Qianru Lao, Yong Cui, Minlan Yu, Michael Mitzenmacher
ICLRW
2025
Faster, Cheaper, Just as Good: Cost- and Latency-Constrained Routing for LLMs
Javid Lakha, Minlan Yu, Rana Shahout
ICLRW
2025
Prefix and Output Length-Aware Scheduling for Efficient Online LLM Inference
Iñaki Arango, Ayush Noori, Yepeng Huang, Rana Shahout, Minlan Yu
NeurIPS
2024
SkipPredict: When to Invest in Predictions for Scheduling
Rana Shahout, Michael Mitzenmacher