ML Anthology
Jin, Yunho
2 publications
NeurIPS 2023
$s^3$: Increasing GPU Utilization During Generative Inference for Higher Throughput
Yunho Jin, Chun-Feng Wu, David Brooks, Gu-Yeon Wei
ICMLW 2023
SpeedLimit: Neural Architecture Search for Quantized Transformer Models
Yuji Chai, Luke Bailey, Yunho Jin, Glenn Ko, Matthew Karle, David Brooks, Gu-Yeon Wei, H. Kung