Wu, Lei
30 publications
CVPR
2025
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices
NeurIPS
2025
Functional Scaling Laws in Kernel Regression: Loss Dynamics and Learning Rate Schedules
ICML
2025
The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
IJCAI
2023
Learning to Self-Reconfigure for Freeform Modular Robots via Altruism Proximal Policy Optimization
NeurIPSW
2023
The Noise Geometry of Stochastic Gradient Descent: A Quantitative and Analytical Characterization
JMLR
2022
A Spectral-Based Analysis of the Separation Between Two-Layer Neural Networks and Linear Methods
NeurIPS
2022
The Alignment Property of SGD Noise and How It Helps Select Flat Minima: A Stability Analysis
NeurIPS
2018
How SGD Selects the Global Minima in Over-Parameterized Learning: A Dynamical Stability Perspective