ML Anthology
Ye, Jiaquan
2 publications
ICLR 2026
DNT: A Deeply Normalized Transformer That Can Be Trained by Momentum SGD
Xianbiao Qi, Marco Chen, Wenjie Xiao, Jiaquan Ye, Yelin He, Chun-Guang Li, Zhouchen Lin
ICLR 2025
Taming Transformer Without Using Learning Rate Warmup
Xianbiao Qi, Yelin He, Jiaquan Ye, Chun-Guang Li, Bojia Zi, Xili Dai, Qin Zou, Rong Xiao