Soudry, Daniel

62 publications

NeurIPS 2025 Alias-Free ViT: Fractional Shift Invariance via Linear Attention Hagay Michaeli, Daniel Soudry

NeurIPS 2025 Are Greedy Task Orderings Better than Random in Continual Linear Regression? Matan Tsipory, Ran Levinstein, Itay Evron, Mark Kong, Deanna Needell, Daniel Soudry

NeurIPS 2025 FP4 All the Way: Fully Quantized Training of Large Language Models Brian Chmiel, Maxim Fishman, Ron Banner, Daniel Soudry

TMLR 2025 Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks Edan Kinderman, Itay Hubara, Haggai Maron, Daniel Soudry

NeurIPS 2025 Optimal Rates in Continual Linear Regression via Increasing Regularization Ran Levinstein, Amit Attia, Matan Schliserman, Uri Sherman, Daniel Soudry, Tomer Koren, Itay Evron

ICLR 2025 Scaling FP8 Training to Trillion-Token LLMs Maxim Fishman, Brian Chmiel, Ron Banner, Daniel Soudry

NeurIPS 2025 Temperature Is All You Need for Generalization in Langevin Dynamics and Other Markov Processes Itamar Harel, Yonathan Wolanowsky, Gal Vardi, Nathan Srebro, Daniel Soudry

NeurIPS 2025 Tensor-Parallelism with Partially Synchronized Activations Itay Lamprecht, Asaf Karnieli, Yair Hanani, Niv Giladi, Daniel Soudry

ICML 2025 When Diffusion Models Memorize: Inductive Biases in Probability Flow of Minimum-Norm Shallow Neural Nets Chen Zeno, Hila Manor, Greg Ongie, Nir Weinberger, Tomer Michaeli, Daniel Soudry

NeurIPS 2024 Exponential Quantum Communication Advantage in Distributed Inference and Learning Dar Gilboa, Hagay Michaeli, Daniel Soudry, Jarrod R. McClean

ICMLW 2024 Exponential Quantum Communication Advantage in Distributed Inference and Learning Hagay Michaeli, Dar Gilboa, Daniel Soudry, Jarrod Ryan McClean

ICML 2024 How Uniform Random Weights Induce Non-Uniform Bias: Typical Interpolating Neural Networks Generalize with Narrow Teachers Gon Buzaglo, Itamar Harel, Mor Shpigel Nacson, Alon Brutzkus, Nathan Srebro, Daniel Soudry

NeurIPS 2024 Provable Tempered Overfitting of Minimal Nets and Typical Nets Itamar Harel, William M. Hoza, Gal Vardi, Itay Evron, Nathan Srebro, Daniel Soudry

ICMLW 2024 Provable Tempered Overfitting of Minimal Nets and Typical Nets Itamar Harel, William M. Hoza, Gal Vardi, Itay Evron, Nathan Srebro, Daniel Soudry

NeurIPS 2024 Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes Dan Qiao, Kaiqi Zhang, Esha Singh, Daniel Soudry, Yu-Xiang Wang

NeurIPS 2024 The Implicit Bias of Gradient Descent on Separable Multiclass Data Hrithik Ravi, Clayton Scott, Daniel Soudry, Yutong Wang

ICLR 2024 The Joint Effect of Task Similarity and Overparameterization on Catastrophic Forgetting — An Analytical Model Daniel Goldfarb, Itay Evron, Nir Weinberger, Daniel Soudry, Paul Hand

ICLR 2024 Towards Cheaper Inference in Deep Networks with Lower Bit-Width Accumulators Yaniv Blumenfeld, Itay Hubara, Daniel Soudry

ICLR 2023 Accurate Neural Training with 4-Bit Matrix Multiplications at Standard Formats Brian Chmiel, Ron Banner, Elad Hoffer, Hilla Ben-Yaacov, Daniel Soudry

CVPR 2023 Alias-Free Convnets: Fractional Shift Invariance via Polynomial Activations Hagay Michaeli, Tomer Michaeli, Daniel Soudry

ICML 2023 Continual Learning in Linear Classification on Separable Data Itay Evron, Edward Moroshko, Gon Buzaglo, Maroun Khriesh, Badea Marjieh, Nathan Srebro, Daniel Soudry

NeurIPS 2023 DropCompute: Simple and More Robust Distributed Synchronous Training via Compute Variance Reduction Niv Giladi, Shahar Gottlieb, Moran Shkolnik, Asaf Karnieli, Ron Banner, Elad Hoffer, Kfir Y. Levy, Daniel Soudry

NeurIPS 2023 Explore to Generalize in Zero-Shot RL Ev Zisselman, Itai Lavie, Daniel Soudry, Aviv Tamar

NeurIPSW 2023 Explore to Generalize in Zero-Shot RL Ev Zisselman, Itai Lavie, Daniel Soudry, Aviv Tamar

ICML 2023 Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond Itai Kreisler, Mor Shpigel Nacson, Daniel Soudry, Yair Carmon

NeurIPS 2023 How Do Minimum-Norm Shallow Denoisers Look in Function Space? Chen Zeno, Greg Ongie, Yaniv Blumenfeld, Nir Weinberger, Daniel Soudry

ICLR 2023 Minimum Variance Unbiased N:M Sparsity for the Neural Gradients Brian Chmiel, Itay Hubara, Ron Banner, Daniel Soudry

ICLR 2023 The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks Mor Shpigel Nacson, Rotem Mulayoff, Greg Ongie, Tomer Michaeli, Daniel Soudry

AISTATS 2023 The Role of Codeword-to-Class Assignments in Error-Correcting Codes: An Empirical Study Itay Evron, Ophir Onn, Tamar Weiss, Hai Azeroual, Daniel Soudry

NeurIPSW 2023 Towards Cheaper Inference in Deep Networks with Lower Bit-Width Accumulators Yaniv Blumenfeld, Itay Hubara, Daniel Soudry

ICLR 2022 A Statistical Framework for Efficient Out of Distribution Detection in Deep Neural Networks Matan Haroush, Tzviel Frostig, Ruth Heller, Daniel Soudry

COLT 2022 How Catastrophic Can Catastrophic Forgetting Be in Linear Regression? Itay Evron, Edward Moroshko, Rachel Ward, Nathan Srebro, Daniel Soudry

ICML 2022 Implicit Bias of the Step Size in Linear Diagonal Neural Networks Mor Shpigel Nacson, Kavya Ravichandran, Nathan Srebro, Daniel Soudry

AAAI 2022 Regularization Guarantees Generalization in Bayesian Reinforcement Learning Through Algorithmic Stability Aviv Tamar, Daniel Soudry, Ev Zisselman

NeurIPS 2021 Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks Itay Hubara, Brian Chmiel, Moshe Island, Ron Banner, Joseph Naor, Daniel Soudry

ICML 2021 Accurate Post Training Quantization with Small Calibration Sets Itay Hubara, Yury Nahshan, Yair Hanani, Ron Banner, Daniel Soudry

ICLR 2021 Neural Gradients Are Near-Lognormal: Improved Quantized and Sparse Training Brian Chmiel, Liad Ben-Uri, Moran Shkolnik, Elad Hoffer, Ron Banner, Daniel Soudry

ICML 2021 On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent Shahar Azulay, Edward Moroshko, Mor Shpigel Nacson, Blake E Woodworth, Nathan Srebro, Amir Globerson, Daniel Soudry

NeurIPS 2021 Physics-Aware Downsampling with Deep Learning for Scalable Flood Modeling Niv Giladi, Zvika Ben-Haim, Sella Nevo, Yossi Matias, Daniel Soudry

NeurIPS 2021 The Implicit Bias of Minima Stability: A View from Function Space Rotem Mulayoff, Tomer Michaeli, Daniel Soudry

ICLR 2020 A Function Space View of Bounded Norm Infinite Width ReLU Nets: The Multivariate Case Greg Ongie, Rebecca Willett, Daniel Soudry, Nathan Srebro

ICLR 2020 At Stability's Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks? Niv Giladi, Mor Shpigel Nacson, Elad Hoffer, Daniel Soudry

ICML 2020 Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization? Yaniv Blumenfeld, Dar Gilboa, Daniel Soudry

NeurIPS 2020 Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy Edward Moroshko, Blake E Woodworth, Suriya Gunasekar, Jason Lee, Nati Srebro, Daniel Soudry

COLT 2020 Kernel and Rich Regimes in Overparametrized Models Blake Woodworth, Suriya Gunasekar, Jason D. Lee, Edward Moroshko, Pedro Savarese, Itay Golan, Daniel Soudry, Nathan Srebro

NeurIPS 2019 A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off Yaniv Blumenfeld, Dar Gilboa, Daniel Soudry

AISTATS 2019 Convergence of Gradient Descent on Separable Data Mor Shpigel Nacson, Jason Lee, Suriya Gunasekar, Pedro Henrique Pamplona Savarese, Nathan Srebro, Daniel Soudry

COLT 2019 How Do Infinite Width Bounded Norm Networks Look in Function Space? Pedro Savarese, Itay Evron, Daniel Soudry, Nathan Srebro

ICML 2019 Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models Mor Shpigel Nacson, Suriya Gunasekar, Jason Lee, Nathan Srebro, Daniel Soudry

NeurIPS 2019 Post Training 4-Bit Quantization of Convolutional Networks for Rapid-Deployment Ron Banner, Yury Nahshan, Daniel Soudry

AISTATS 2019 Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate Mor Shpigel Nacson, Nathan Srebro, Daniel Soudry

ICML 2018 Characterizing Implicit Bias in Terms of Optimization Geometry Suriya Gunasekar, Jason Lee, Daniel Soudry, Nathan Srebro

ICLR 2018 Fix Your Classifier: The Marginal Value of Training the Last Weight Layer Elad Hoffer, Itay Hubara, Daniel Soudry

NeurIPS 2018 Implicit Bias of Gradient Descent on Linear Convolutional Networks Suriya Gunasekar, Jason Lee, Daniel Soudry, Nati Srebro

NeurIPS 2018 Norm Matters: Efficient and Accurate Normalization Schemes in Deep Networks Elad Hoffer, Ron Banner, Itay Golan, Daniel Soudry

NeurIPS 2018 Scalable Methods for 8-Bit Training of Neural Networks Ron Banner, Itay Hubara, Elad Hoffer, Daniel Soudry

ICLR 2018 The Implicit Bias of Gradient Descent on Separable Data Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Nathan Srebro

JMLR 2018 The Implicit Bias of Gradient Descent on Separable Data Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya Gunasekar, Nathan Srebro

NeurIPS 2017 Train Longer, Generalize Better: Closing the Generalization Gap in Large Batch Training of Neural Networks Elad Hoffer, Itay Hubara, Daniel Soudry

NeurIPS 2016 Binarized Neural Networks Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio

NeurIPS 2014 Expectation Backpropagation: Parameter-Free Training of Multilayer Neural Networks with Continuous or Discrete Weights Daniel Soudry, Itay Hubara, Ron Meir

NeurIPS 2012 Neuronal Spike Generation Mechanism as an Oversampling, Noise-Shaping A-to-D Converter Dmitri B. Chklovskii, Daniel Soudry