Malach, Eran

31 publications

ICLR 2025 A New Perspective on Shampoo's Preconditioner Depen Morwani, Itai Shapira, Nikhil Vyas, Eran Malach, Sham M. Kakade, Lucas Janson
ICLR 2025 Don’t Stop Me Now: Embedding Based Scheduling for LLMs Rana Shahout, Eran Malach, Chunwei Liu, Weifan Jiang, Minlan Yu, Michael Mitzenmacher
NeurIPS 2025 Let Me Think! A Long Chain of Thought Can Be Worth Exponentially Many Short Ones Parsa Mirtaheri, Ezra Edelman, Samy Jelassi, Eran Malach, Enric Boix-Adserà
TMLR 2025 Loss-to-Loss Prediction: Scaling Laws for All Datasets David Brandfonbrener, Nikhil Anand, Nikhil Vyas, Eran Malach, Sham M. Kakade
ICLR 2025 Mixture of Parrots: Experts Improve Memorization More than Reasoning Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham M. Kakade, Eran Malach
ICML 2025 The Power of Random Features and the Limits of Distribution-Free Gradient Descent Ari Karchmer, Eran Malach
ICML 2025 The Role of Sparsity for Length Generalization in LLMs Noah Golowich, Samy Jelassi, David Brandfonbrener, Sham M. Kakade, Eran Malach
ICML 2025 Universal Length Generalization with Turing Programs Kaiying Hou, David Brandfonbrener, Sham M. Kakade, Samy Jelassi, Eran Malach
ICML 2024 Auto-Regressive Next-Token Predictors Are Universal Learners Eran Malach
NeurIPSW 2024 Mixture of Parrots: Mixtures of Experts Improve Memorization More than Reasoning Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham M. Kakade, Eran Malach
NeurIPS 2024 On the Power of Decision Trees in Auto-Regressive Language Modeling Yulu Gan, Tomer Galanti, Tomaso Poggio, Eran Malach
ICML 2024 Repeat After Me: Transformers Are Better than State Space Models at Copying Samy Jelassi, David Brandfonbrener, Sham M. Kakade, Eran Malach
NeurIPS 2024 The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains Ezra Edelman, Nikolaos Tsilivis, Benjamin L. Edelman, Eran Malach, Surbhi Goel
NeurIPSW 2024 The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains Ezra Edelman, Nikolaos Tsilivis, Surbhi Goel, Benjamin L. Edelman, Eran Malach
NeurIPS 2024 Transcendence: Generative Models Can Outperform the Experts That Train Them Edwin Zhang, Vincent Zhu, Naomi Saphra, Anat Kleiman, Benjamin L. Edelman, Milind Tambe, Sham Kakade, Eran Malach
NeurIPS 2023 Pareto Frontiers in Deep Feature Learning: Data, Compute, Width, and Luck Benjamin Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang
ICML 2022 Efficient Learning of CNNs Using Patch Based Features Alon Brutzkus, Amir Globerson, Eran Malach, Alon Regev Netser, Shai Shalev-Shwartz
NeurIPS 2022 Hidden Progress in Deep Learning: SGD Learns Parities near the Computational Limit Boaz Barak, Benjamin Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang
NeurIPS 2022 Knowledge Distillation: Bad Models Can Be Good Role Models Gal Kaplun, Eran Malach, Preetum Nakkiran, Shai Shalev-Shwartz
JMLR 2022 When Hardness of Approximation Meets Hardness of Learning Eran Malach, Shai Shalev-Shwartz
ICLR 2021 Computational Separation Between Convolutional and Fully-Connected Networks Eran Malach, Shai Shalev-Shwartz
NeurIPS 2021 On the Power of Differentiable Learning Versus PAC and SQ Learning Emmanuel Abbe, Pritish Kamath, Eran Malach, Colin Sandon, Nathan Srebro
ICML 2021 Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels Eran Malach, Pritish Kamath, Emmanuel Abbe, Nathan Srebro
COLT 2021 The Connection Between Approximation, Depth Separation and Learnability in Neural Networks Eran Malach, Gilad Yehudai, Shai Shalev-Shwartz, Ohad Shamir
COLT 2020 ID3 Learns Juntas for Smoothed Product Distributions Alon Brutzkus, Amit Daniely, Eran Malach
NeurIPS 2020 Learning Parities with Neural Networks Amit Daniely, Eran Malach
ICML 2020 Proving the Lottery Ticket Hypothesis: Pruning Is All You Need Eran Malach, Gilad Yehudai, Shai Shalev-Shwartz, Ohad Shamir
NeurIPS 2020 The Implications of Local Correlation on Learning Some Deep Functions Eran Malach, Shai Shalev-Shwartz
NeurIPS 2019 Is Deeper Better Only When Shallow Is Good? Eran Malach, Shai Shalev-Shwartz
ICLR 2018 SGD Learns Over-Parameterized Networks That Provably Generalize on Linearly Separable Data Alon Brutzkus, Amir Globerson, Eran Malach, Shai Shalev-Shwartz
NeurIPS 2017 Decoupling "When to Update" from "How to Update" Eran Malach, Shai Shalev-Shwartz