E, Weinan
14 publications
ICML
2025
The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
JMLR
2022
Approximation and Optimization Theory for Linear Continuous-Time Recurrent Neural Networks
ICLR
2021
On the Curse of Memory in Recurrent Neural Networks: Approximation and Optimization Analysis
NeurIPS
2020
Towards Theoretically Understanding Why SGD Generalizes Better than Adam in Deep Learning
NeurIPS
2018
End-to-End Symmetry Preserving Inter-Atomic Potential Energy Model for Finite and Extended Systems