Ro, Yeonju

5 publications

ICML 2025 On-the-Fly Adaptive Distillation of Transformer to Dual-State Linear Attention for Long-Context LLM Serving Yeonju Ro, Zhenyu Zhang, Souvik Kundu, Zhangyang Wang, Aditya Akella
NeurIPS 2024 Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design Ruisi Cai, Yeonju Ro, Geon-Woo Kim, Peihao Wang, Babak Ehteshami Bejnordi, Aditya Akella, Zhangyang Wang
CVPRW 2023 Dataset Efficient Training with Model Ensembling Yeonju Ro, Cong Xu, Agnieszka Ciborowska, Suparna Bhattacharya, Frankie Li, Martin Foltin
ICML 2023 Lowering the Pre-Training Tax for Gradient-Based Subset Training: A Lightweight Distributed Pre-Training Toolkit Yeonju Ro, Zhangyang Wang, Vijay Chidambaram, Aditya Akella
CVPR 2022 Mr.BiQ: Post-Training Non-Uniform Quantization Based on Minimizing the Reconstruction Error Yongkweon Jeon, Chungman Lee, Eulrang Cho, Yeonju Ro