Soudry, Daniel

62 publications

NeurIPS 2025 · Alias-Free ViT: Fractional Shift Invariance via Linear Attention · Hagay Michaeli, Daniel Soudry
NeurIPS 2025 · Are Greedy Task Orderings Better than Random in Continual Linear Regression? · Matan Tsipory, Ran Levinstein, Itay Evron, Mark Kong, Deanna Needell, Daniel Soudry
NeurIPS 2025 · FP4 All the Way: Fully Quantized Training of Large Language Models · Brian Chmiel, Maxim Fishman, Ron Banner, Daniel Soudry
TMLR 2025 · Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks · Edan Kinderman, Itay Hubara, Haggai Maron, Daniel Soudry
NeurIPS 2025 · Optimal Rates in Continual Linear Regression via Increasing Regularization · Ran Levinstein, Amit Attia, Matan Schliserman, Uri Sherman, Daniel Soudry, Tomer Koren, Itay Evron
ICLR 2025 · Scaling FP8 Training to Trillion-Token LLMs · Maxim Fishman, Brian Chmiel, Ron Banner, Daniel Soudry
NeurIPS 2025 · Temperature Is All You Need for Generalization in Langevin Dynamics and Other Markov Processes · Itamar Harel, Yonathan Wolanowsky, Gal Vardi, Nathan Srebro, Daniel Soudry
NeurIPS 2025 · Tensor-Parallelism with Partially Synchronized Activations · Itay Lamprecht, Asaf Karnieli, Yair Hanani, Niv Giladi, Daniel Soudry
ICML 2025 · When Diffusion Models Memorize: Inductive Biases in Probability Flow of Minimum-Norm Shallow Neural Nets · Chen Zeno, Hila Manor, Greg Ongie, Nir Weinberger, Tomer Michaeli, Daniel Soudry
NeurIPS 2024 · Exponential Quantum Communication Advantage in Distributed Inference and Learning · Dar Gilboa, Hagay Michaeli, Daniel Soudry, Jarrod R. McClean
ICMLW 2024 · Exponential Quantum Communication Advantage in Distributed Inference and Learning · Hagay Michaeli, Dar Gilboa, Daniel Soudry, Jarrod R. McClean
ICML 2024 · How Uniform Random Weights Induce Non-Uniform Bias: Typical Interpolating Neural Networks Generalize with Narrow Teachers · Gon Buzaglo, Itamar Harel, Mor Shpigel Nacson, Alon Brutzkus, Nathan Srebro, Daniel Soudry
NeurIPS 2024 · Provable Tempered Overfitting of Minimal Nets and Typical Nets · Itamar Harel, William M. Hoza, Gal Vardi, Itay Evron, Nathan Srebro, Daniel Soudry
ICMLW 2024 · Provable Tempered Overfitting of Minimal Nets and Typical Nets · Itamar Harel, William M. Hoza, Gal Vardi, Itay Evron, Nathan Srebro, Daniel Soudry
NeurIPS 2024 · Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes · Dan Qiao, Kaiqi Zhang, Esha Singh, Daniel Soudry, Yu-Xiang Wang
NeurIPS 2024 · The Implicit Bias of Gradient Descent on Separable Multiclass Data · Hrithik Ravi, Clayton Scott, Daniel Soudry, Yutong Wang
ICLR 2024 · The Joint Effect of Task Similarity and Overparameterization on Catastrophic Forgetting — An Analytical Model · Daniel Goldfarb, Itay Evron, Nir Weinberger, Daniel Soudry, Paul Hand
ICLR 2024 · Towards Cheaper Inference in Deep Networks with Lower Bit-Width Accumulators · Yaniv Blumenfeld, Itay Hubara, Daniel Soudry
ICLR 2023 · Accurate Neural Training with 4-Bit Matrix Multiplications at Standard Formats · Brian Chmiel, Ron Banner, Elad Hoffer, Hilla Ben-Yaacov, Daniel Soudry
CVPR 2023 · Alias-Free Convnets: Fractional Shift Invariance via Polynomial Activations · Hagay Michaeli, Tomer Michaeli, Daniel Soudry
ICML 2023 · Continual Learning in Linear Classification on Separable Data · Itay Evron, Edward Moroshko, Gon Buzaglo, Maroun Khriesh, Badea Marjieh, Nathan Srebro, Daniel Soudry
NeurIPS 2023 · DropCompute: Simple and More Robust Distributed Synchronous Training via Compute Variance Reduction · Niv Giladi, Shahar Gottlieb, Moran Shkolnik, Asaf Karnieli, Ron Banner, Elad Hoffer, Kfir Y. Levy, Daniel Soudry
NeurIPS 2023 · Explore to Generalize in Zero-Shot RL · Ev Zisselman, Itai Lavie, Daniel Soudry, Aviv Tamar
NeurIPSW 2023 · Explore to Generalize in Zero-Shot RL · Ev Zisselman, Itai Lavie, Daniel Soudry, Aviv Tamar
ICML 2023 · Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond · Itai Kreisler, Mor Shpigel Nacson, Daniel Soudry, Yair Carmon
NeurIPS 2023 · How Do Minimum-Norm Shallow Denoisers Look in Function Space? · Chen Zeno, Greg Ongie, Yaniv Blumenfeld, Nir Weinberger, Daniel Soudry
ICLR 2023 · Minimum Variance Unbiased N:M Sparsity for the Neural Gradients · Brian Chmiel, Itay Hubara, Ron Banner, Daniel Soudry
ICLR 2023 · The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks · Mor Shpigel Nacson, Rotem Mulayoff, Greg Ongie, Tomer Michaeli, Daniel Soudry
AISTATS 2023 · The Role of Codeword-to-Class Assignments in Error-Correcting Codes: An Empirical Study · Itay Evron, Ophir Onn, Tamar Weiss, Hai Azeroual, Daniel Soudry
NeurIPSW 2023 · Towards Cheaper Inference in Deep Networks with Lower Bit-Width Accumulators · Yaniv Blumenfeld, Itay Hubara, Daniel Soudry
ICLR 2022 · A Statistical Framework for Efficient Out of Distribution Detection in Deep Neural Networks · Matan Haroush, Tzviel Frostig, Ruth Heller, Daniel Soudry
COLT 2022 · How Catastrophic Can Catastrophic Forgetting Be in Linear Regression? · Itay Evron, Edward Moroshko, Rachel Ward, Nathan Srebro, Daniel Soudry
ICML 2022 · Implicit Bias of the Step Size in Linear Diagonal Neural Networks · Mor Shpigel Nacson, Kavya Ravichandran, Nathan Srebro, Daniel Soudry
AAAI 2022 · Regularization Guarantees Generalization in Bayesian Reinforcement Learning Through Algorithmic Stability · Aviv Tamar, Daniel Soudry, Ev Zisselman
NeurIPS 2021 · Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks · Itay Hubara, Brian Chmiel, Moshe Island, Ron Banner, Joseph Naor, Daniel Soudry
ICML 2021 · Accurate Post Training Quantization with Small Calibration Sets · Itay Hubara, Yury Nahshan, Yair Hanani, Ron Banner, Daniel Soudry
ICLR 2021 · Neural Gradients Are Near-Lognormal: Improved Quantized and Sparse Training · Brian Chmiel, Liad Ben-Uri, Moran Shkolnik, Elad Hoffer, Ron Banner, Daniel Soudry
ICML 2021 · On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent · Shahar Azulay, Edward Moroshko, Mor Shpigel Nacson, Blake Woodworth, Nathan Srebro, Amir Globerson, Daniel Soudry
NeurIPS 2021 · Physics-Aware Downsampling with Deep Learning for Scalable Flood Modeling · Niv Giladi, Zvika Ben-Haim, Sella Nevo, Yossi Matias, Daniel Soudry
NeurIPS 2021 · The Implicit Bias of Minima Stability: A View from Function Space · Rotem Mulayoff, Tomer Michaeli, Daniel Soudry
ICLR 2020 · A Function Space View of Bounded Norm Infinite Width ReLU Nets: The Multivariate Case · Greg Ongie, Rebecca Willett, Daniel Soudry, Nathan Srebro
ICLR 2020 · At Stability's Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks? · Niv Giladi, Mor Shpigel Nacson, Elad Hoffer, Daniel Soudry
ICML 2020 · Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization? · Yaniv Blumenfeld, Dar Gilboa, Daniel Soudry
NeurIPS 2020 · Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy · Edward Moroshko, Blake Woodworth, Suriya Gunasekar, Jason D. Lee, Nathan Srebro, Daniel Soudry
COLT 2020 · Kernel and Rich Regimes in Overparametrized Models · Blake Woodworth, Suriya Gunasekar, Jason D. Lee, Edward Moroshko, Pedro Savarese, Itay Golan, Daniel Soudry, Nathan Srebro
NeurIPS 2019 · A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off · Yaniv Blumenfeld, Dar Gilboa, Daniel Soudry
AISTATS 2019 · Convergence of Gradient Descent on Separable Data · Mor Shpigel Nacson, Jason D. Lee, Suriya Gunasekar, Pedro Savarese, Nathan Srebro, Daniel Soudry
COLT 2019 · How Do Infinite Width Bounded Norm Networks Look in Function Space? · Pedro Savarese, Itay Evron, Daniel Soudry, Nathan Srebro
ICML 2019 · Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models · Mor Shpigel Nacson, Suriya Gunasekar, Jason D. Lee, Nathan Srebro, Daniel Soudry
NeurIPS 2019 · Post Training 4-Bit Quantization of Convolutional Networks for Rapid-Deployment · Ron Banner, Yury Nahshan, Daniel Soudry
AISTATS 2019 · Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate · Mor Shpigel Nacson, Nathan Srebro, Daniel Soudry
ICML 2018 · Characterizing Implicit Bias in Terms of Optimization Geometry · Suriya Gunasekar, Jason D. Lee, Daniel Soudry, Nathan Srebro
ICLR 2018 · Fix Your Classifier: The Marginal Value of Training the Last Weight Layer · Elad Hoffer, Itay Hubara, Daniel Soudry
NeurIPS 2018 · Implicit Bias of Gradient Descent on Linear Convolutional Networks · Suriya Gunasekar, Jason D. Lee, Daniel Soudry, Nathan Srebro
NeurIPS 2018 · Norm Matters: Efficient and Accurate Normalization Schemes in Deep Networks · Elad Hoffer, Ron Banner, Itay Golan, Daniel Soudry
NeurIPS 2018 · Scalable Methods for 8-Bit Training of Neural Networks · Ron Banner, Itay Hubara, Elad Hoffer, Daniel Soudry
ICLR 2018 · The Implicit Bias of Gradient Descent on Separable Data · Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Nathan Srebro
JMLR 2018 · The Implicit Bias of Gradient Descent on Separable Data · Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya Gunasekar, Nathan Srebro
NeurIPS 2017 · Train Longer, Generalize Better: Closing the Generalization Gap in Large Batch Training of Neural Networks · Elad Hoffer, Itay Hubara, Daniel Soudry
NeurIPS 2016 · Binarized Neural Networks · Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio
NeurIPS 2014 · Expectation Backpropagation: Parameter-Free Training of Multilayer Neural Networks with Continuous or Discrete Weights · Daniel Soudry, Itay Hubara, Ron Meir
NeurIPS 2012 · Neuronal Spike Generation Mechanism as an Oversampling, Noise-Shaping A-to-D Converter · Dmitri B. Chklovskii, Daniel Soudry