ML Anthology
Xiao, Wenjie
1 publication
ICLR 2026
DNT: A Deeply Normalized Transformer That Can Be Trained by Momentum SGD
Xianbiao Qi, Marco Chen, Wenjie Xiao, Jiaquan Ye, Yelin He, Chun-Guang Li, Zhouchen Lin