Ye, Zhiling

2 publications

ICLR 2025. EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models. Jialiang Cheng, Ning Gao, Yun Yue, Zhiling Ye, Jiadi Jiang, Jian Sha
NeurIPS 2023. AGD: An Auto-Switchable Optimizer Using Stepwise Gradient Difference for Preconditioning Matrix. Yun Yue, Zhiling Ye, Jiadi Jiang, Yongchao Liu, Ke Zhang