Cai, Weilin

2 publications

ICLR 2026. Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts. Shwai He, Weilin Cai, Jiayi Huang, Ang Li.
ICML 2025. Shortcut-Connected Expert Parallelism for Accelerating Mixture of Experts. Weilin Cai, Juyong Jiang, Le Qin, Junwei Cui, Sunghun Kim, Jiayi Huang.