Daneshmand, Hadi

18 publications

NeurIPS 2025 Linear Transformers Implicitly Discover Unified Numerical Algorithms Patrick Lutz, Aditya Gangrade, Hadi Daneshmand, Venkatesh Saligrama

ICLR 2025 Transformers Can Learn Temporal Difference Methods for In-Context Reinforcement Learning Jiuqi Wang, Ethan Blaser, Hadi Daneshmand, Shangtong Zhang

ICLR 2024 Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion Alexandru Meterez, Amir Joudaki, Francesco Orabona, Alexander Immer, Gunnar Ratsch, Hadi Daneshmand

ICMLW 2024 Transformers Learn Temporal Difference Methods for In-Context Reinforcement Learning Jiuqi Wang, Ethan H. Blaser, Hadi Daneshmand, Shangtong Zhang

ICML 2023 Efficient Displacement Convex Optimization with Particle Gradient Descent Hadi Daneshmand, Jason D. Lee, Chi Jin

ICML 2023 On Bridging the Gap Between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization Amir Joudaki, Hadi Daneshmand, Francis Bach

NeurIPS 2023 On the Impact of Activation and Normalization in Obtaining Isometric Embeddings at Initialization Amir Joudaki, Hadi Daneshmand, Francis R. Bach

NeurIPS 2023 Transformers Learn to Implement Preconditioned Gradient Descent for In-Context Learning Kwangjun Ahn, Xiang Cheng, Hadi Daneshmand, Suvrit Sra

AISTATS 2021 Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization Peiyuan Zhang, Antonio Orvieto, Hadi Daneshmand, Thomas Hofmann, Roy S. Smith

NeurIPS 2021 Batch Normalization Orthogonalizes Representations in Deep Random Networks Hadi Daneshmand, Amir Joudaki, Francis R. Bach

NeurIPS 2021 Rethinking the Variational Interpretation of Accelerated Optimization Methods Peiyuan Zhang, Antonio Orvieto, Hadi Daneshmand

NeurIPS 2020 Batch Normalization Provably Avoids Ranks Collapse for Randomly Initialised Deep Networks Hadi Daneshmand, Jonas Kohler, Francis R. Bach, Thomas Hofmann, Aurelien Lucchi

AISTATS 2019 Exponential Convergence Rates for Batch Normalization: The Power of Length-Direction Decoupling in Non-Convex Optimization Jonas Kohler, Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann, Ming Zhou, Klaus Neymeyr

AISTATS 2019 Local Saddle Point Optimization: A Curvature Exploitation Approach Leonard Adolphs, Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann

ICML 2018 Escaping Saddles with Stochastic Gradients Hadi Daneshmand, Jonas Kohler, Aurelien Lucchi, Thomas Hofmann

NeurIPS 2016 Adaptive Newton Method for Empirical Risk Minimization to Statistical Accuracy Aryan Mokhtari, Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann, Alejandro Ribeiro

ICML 2016 Starting Small - Learning with Adaptive Sample Sizes Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann

ICML 2014 Estimating Diffusion Network Structures: Recovery Conditions, Sample Complexity & Soft-Thresholding Algorithm Hadi Daneshmand, Manuel Gomez-Rodriguez, Le Song, Bernhard Schoelkopf