Malach, Eran
31 publications
ICLR
2025
A New Perspective on Shampoo's Preconditioner
Depen Morwani, Itai Shapira, Nikhil Vyas, Eran Malach, Sham M. Kakade, Lucas Janson
ICLR
2025
Don’t Stop Me Now: Embedding Based Scheduling for LLMs
Rana Shahout, Eran Malach, Chunwei Liu, Weifan Jiang, Minlan Yu, Michael Mitzenmacher
NeurIPS
2025
Let Me Think! A Long Chain of Thought Can Be Worth Exponentially Many Short Ones
Parsa Mirtaheri, Ezra Edelman, Samy Jelassi, Eran Malach, Enric Boix-Adserà
TMLR
2025
Loss-to-Loss Prediction: Scaling Laws for All Datasets
David Brandfonbrener, Nikhil Anand, Nikhil Vyas, Eran Malach, Sham M. Kakade
ICLR
2025
Mixture of Parrots: Experts Improve Memorization More than Reasoning
Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham M. Kakade, Eran Malach
ICML
2025
The Power of Random Features and the Limits of Distribution-Free Gradient Descent
Ari Karchmer, Eran Malach
ICML
2025
The Role of Sparsity for Length Generalization in LLMs
Noah Golowich, Samy Jelassi, David Brandfonbrener, Sham M. Kakade, Eran Malach
ICML
2025
Universal Length Generalization with Turing Programs
Kaiying Hou, David Brandfonbrener, Sham M. Kakade, Samy Jelassi, Eran Malach
ICML
2024
Auto-Regressive Next-Token Predictors Are Universal Learners
Eran Malach
NeurIPSW
2024
Mixture of Parrots: Mixtures of Experts Improve Memorization More than Reasoning
Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham M. Kakade, Eran Malach
NeurIPS
2024
On the Power of Decision Trees in Auto-Regressive Language Modeling
Yulu Gan, Tomer Galanti, Tomaso Poggio, Eran Malach
ICML
2024
Repeat After Me: Transformers Are Better than State Space Models at Copying
Samy Jelassi, David Brandfonbrener, Sham M. Kakade, Eran Malach
NeurIPS
2024
The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains
Ezra Edelman, Nikolaos Tsilivis, Benjamin L. Edelman, Eran Malach, Surbhi Goel
NeurIPSW
2024
The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains
Ezra Edelman, Nikolaos Tsilivis, Surbhi Goel, Benjamin L. Edelman, Eran Malach
NeurIPS
2024
Transcendence: Generative Models Can Outperform the Experts That Train Them
Edwin Zhang, Vincent Zhu, Naomi Saphra, Anat Kleiman, Benjamin L. Edelman, Milind Tambe, Sham Kakade, Eran Malach
NeurIPS
2023
Pareto Frontiers in Deep Feature Learning: Data, Compute, Width, and Luck
Benjamin Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang
ICML
2022
Efficient Learning of CNNs Using Patch Based Features
Alon Brutzkus, Amir Globerson, Eran Malach, Alon Regev Netser, Shai Shalev-Shwartz
NeurIPS
2022
Hidden Progress in Deep Learning: SGD Learns Parities near the Computational Limit
Boaz Barak, Benjamin Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang
NeurIPS
2022
Knowledge Distillation: Bad Models Can Be Good Role Models
Gal Kaplun, Eran Malach, Preetum Nakkiran, Shai Shalev-Shwartz
JMLR
2022
When Hardness of Approximation Meets Hardness of Learning
Eran Malach, Shai Shalev-Shwartz
ICLR
2021
Computational Separation Between Convolutional and Fully-Connected Networks
Eran Malach, Shai Shalev-Shwartz
NeurIPS
2021
On the Power of Differentiable Learning Versus PAC and SQ Learning
Emmanuel Abbe, Pritish Kamath, Eran Malach, Colin Sandon, Nathan Srebro
ICML
2021
Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels
Eran Malach, Pritish Kamath, Emmanuel Abbe, Nathan Srebro
COLT
2021
The Connection Between Approximation, Depth Separation and Learnability in Neural Networks
Eran Malach, Gilad Yehudai, Shai Shalev-Shwartz, Ohad Shamir
COLT
2020
ID3 Learns Juntas for Smoothed Product Distributions
Alon Brutzkus, Amit Daniely, Eran Malach
NeurIPS
2020
Learning Parities with Neural Networks
Amit Daniely, Eran Malach
ICML
2020
Proving the Lottery Ticket Hypothesis: Pruning Is All You Need
Eran Malach, Gilad Yehudai, Shai Shalev-Shwartz, Ohad Shamir
NeurIPS
2020
The Implications of Local Correlation on Learning Some Deep Functions
Eran Malach, Shai Shalev-Shwartz
NeurIPS
2019
Is Deeper Better Only When Shallow Is Good?
Eran Malach, Shai Shalev-Shwartz
ICLR
2018
SGD Learns Over-Parameterized Networks That Provably Generalize on Linearly Separable Data
Alon Brutzkus, Amir Globerson, Eran Malach, Shai Shalev-Shwartz
NeurIPS
2017
Decoupling "When to Update" from "How to Update"
Eran Malach, Shai Shalev-Shwartz