Cao, Sheng

1 publication

ICLR 2025. Param$\Delta$ for Direct Mixing: Post-Train Large Language Model at Zero Cost. Sheng Cao, Mingrui Wu, Karthik Prasad, Yuandong Tian, Zechun Liu