Daneshmand, Hadi

18 publications

NeurIPS 2025. Linear Transformers Implicitly Discover Unified Numerical Algorithms. Patrick Lutz, Aditya Gangrade, Hadi Daneshmand, Venkatesh Saligrama.
ICLR 2025. Transformers Can Learn Temporal Difference Methods for In-Context Reinforcement Learning. Jiuqi Wang, Ethan Blaser, Hadi Daneshmand, Shangtong Zhang.
ICLR 2024. Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion. Alexandru Meterez, Amir Joudaki, Francesco Orabona, Alexander Immer, Gunnar Rätsch, Hadi Daneshmand.
ICMLW 2024. Transformers Learn Temporal Difference Methods for In-Context Reinforcement Learning. Jiuqi Wang, Ethan H. Blaser, Hadi Daneshmand, Shangtong Zhang.
ICML 2023. Efficient Displacement Convex Optimization with Particle Gradient Descent. Hadi Daneshmand, Jason D. Lee, Chi Jin.
ICML 2023. On Bridging the Gap Between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization. Amir Joudaki, Hadi Daneshmand, Francis Bach.
NeurIPS 2023. On the Impact of Activation and Normalization in Obtaining Isometric Embeddings at Initialization. Amir Joudaki, Hadi Daneshmand, Francis R. Bach.
NeurIPS 2023. Transformers Learn to Implement Preconditioned Gradient Descent for In-Context Learning. Kwangjun Ahn, Xiang Cheng, Hadi Daneshmand, Suvrit Sra.
AISTATS 2021. Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization. Peiyuan Zhang, Antonio Orvieto, Hadi Daneshmand, Thomas Hofmann, Roy S. Smith.
NeurIPS 2021. Batch Normalization Orthogonalizes Representations in Deep Random Networks. Hadi Daneshmand, Amir Joudaki, Francis R. Bach.
NeurIPS 2021. Rethinking the Variational Interpretation of Accelerated Optimization Methods. Peiyuan Zhang, Antonio Orvieto, Hadi Daneshmand.
NeurIPS 2020. Batch Normalization Provably Avoids Ranks Collapse for Randomly Initialised Deep Networks. Hadi Daneshmand, Jonas Kohler, Francis R. Bach, Thomas Hofmann, Aurelien Lucchi.
AISTATS 2019. Exponential Convergence Rates for Batch Normalization: The Power of Length-Direction Decoupling in Non-Convex Optimization. Jonas Kohler, Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann, Ming Zhou, Klaus Neymeyr.
AISTATS 2019. Local Saddle Point Optimization: A Curvature Exploitation Approach. Leonard Adolphs, Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann.
ICML 2018. Escaping Saddles with Stochastic Gradients. Hadi Daneshmand, Jonas Kohler, Aurelien Lucchi, Thomas Hofmann.
NeurIPS 2016. Adaptive Newton Method for Empirical Risk Minimization to Statistical Accuracy. Aryan Mokhtari, Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann, Alejandro Ribeiro.
ICML 2016. Starting Small - Learning with Adaptive Sample Sizes. Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann.
ICML 2014. Estimating Diffusion Network Structures: Recovery Conditions, Sample Complexity & Soft-Thresholding Algorithm. Hadi Daneshmand, Manuel Gomez-Rodriguez, Le Song, Bernhard Schölkopf.