Daneshmand, Hadi
18 publications
NeurIPS
2023
On the Impact of Activation and Normalization in Obtaining Isometric Embeddings at Initialization
NeurIPS
2023
Transformers Learn to Implement Preconditioned Gradient Descent for In-Context Learning
AISTATS
2021
Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization