Zhu, Libin

9 publications

ICML 2025. Emergence in Non-Neural Models: Grokking Modular Arithmetic via Average Gradient Outer Product. Neil Rohit Mallinar, Daniel Beaglehole, Libin Zhu, Adityanarayanan Radhakrishnan, Parthe Pandit, Mikhail Belkin

ICML 2024. Catapults in SGD: Spikes in the Training Loss and Their Impact on Generalization Through Feature Learning. Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin

NeurIPSW 2024. Emergence in Non-Neural Models: Grokking Modular Arithmetic via Average Gradient Outer Product. Neil Rohit Mallinar, Daniel Beaglehole, Libin Zhu, Adityanarayanan Radhakrishnan, Parthe Pandit, Mikhail Belkin

ICLR 2024. Quadratic Models for Understanding Catapult Dynamics of Neural Networks. Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin

UAI 2023. Neural Tangent Kernel at Initialization: Linear Width Suffices. Arindam Banerjee, Pedro Cisneros-Velarde, Libin Zhu, Mikhail Belkin

ICLR 2023. Restricted Strong Convexity of Deep Learning Models with Smooth Activations. Arindam Banerjee, Pedro Cisneros-Velarde, Libin Zhu, Misha Belkin

NeurIPS 2022. Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture. Libin Zhu, Chaoyue Liu, Misha Belkin

ICLR 2022. Transition to Linearity of Wide Neural Networks Is an Emerging Property of Assembling Weak Models. Chaoyue Liu, Libin Zhu, Misha Belkin

NeurIPS 2020. On the Linearity of Large Non-Linear Models: When and Why the Tangent Kernel Is Constant. Chaoyue Liu, Libin Zhu, Misha Belkin