Soudry, Daniel
62 publications
TMLR
2025
Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks
NeurIPS
2025
Temperature Is All You Need for Generalization in Langevin Dynamics and Other Markov Processes
NeurIPS
2024
Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes
NeurIPS
2023
DropCompute: Simple and More Robust Distributed Synchronous Training via Compute Variance Reduction
AISTATS
2023
The Role of Codeword-to-Class Assignments in Error-Correcting Codes: An Empirical Study
ICLR
2022
A Statistical Framework for Efficient Out of Distribution Detection in Deep Neural Networks
NeurIPS
2021
Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks
ICML
2020
Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization?
AISTATS
2019
Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate