Wu, Denny
37 publications
NeurIPS
2025
From Shortcut to Induction Head: How Data Diversity Shapes Algorithm Selection in Transformers
COLT
2025
Mean-Field Analysis of Polynomial-Width Two-Layer Neural Network Beyond Finite Time Horizon
ICML
2025
Metastable Dynamics of Chain-of-Thought Reasoning: Provable Benefits of Search, RL and Distillation
NeurIPS
2025
When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective
ICMLW
2024
Neural Network Learns Low-Dimensional Polynomials with SGD near the Information-Theoretic Limit
NeurIPS
2024
Neural Network Learns Low-Dimensional Polynomials with SGD near the Information-Theoretic Limit
NeurIPS
2023
Convergence of Mean-Field Langevin Dynamics: Time-Space Discretization, Stochastic Gradient, and Variance Reduction
NeurIPS
2023
Feature Learning via Mean-Field Langevin Dynamics: Classifying Sparse Parities and Beyond
NeurIPS
2023
Learning in the Presence of Low-Dimensional Structure: A Spiked Random Matrix Perspective
NeurIPS
2022
High-Dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
NeurIPS
2022
Two-Layer Neural Network on Infinite Dimensional Data: Global Optimization Guarantee in the Mean-Field Regime