Wasserblat, Moshe

3 publications

ICML 2025. Accelerating LLM Inference with Lossless Speculative Decoding Algorithms for Heterogeneous Vocabularies. Nadav Timor, Jonathan Mamou, Daniel Korat, Moshe Berchansky, Gaurav Jain, Oren Pereg, Moshe Wasserblat, David Harel
ICLR 2025. Distributed Speculative Inference (DSI): Speculation Parallelism for Provably Faster Lossless Language Model Inference. Nadav Timor, Jonathan Mamou, Daniel Korat, Moshe Berchansky, Oren Pereg, Moshe Wasserblat, Tomer Galanti, Michal Gordon-Kiwkowitz, David Harel
ICLR 2025. HELMET: How to Evaluate Long-Context Models Effectively and Thoroughly. Howard Yen, Tianyu Gao, Minmin Hou, Ke Ding, Daniel Fleischer, Peter Izsak, Moshe Wasserblat, Danqi Chen