Mei, Song
39 publications
NeurIPS
2025
Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers
NeurIPSW
2024
Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs
NeurIPSW
2024
Choose Your Anchor Wisely: Effective Unlearning Diffusion Models via Concept Reconditioning
NeurIPS
2024
Large Stepsize Gradient Descent for Non-Homogeneous Two-Layer Networks: Margin Improvement and Fast Optimization
NeurIPS
2024
Statistical Estimation in the Spiked Tensor Model via the Quantum Approximate Optimization Algorithm
NeurIPSW
2023
Deep Networks as Denoising Algorithms: Sample-Efficient Learning of Diffusion Models in High-Dimensional Graphical Models
NeurIPSW
2023
How Do Transformers Learn In-Context Beyond Simple Functions? A Case Study on Learning with Representations
NeurIPSW
2023
Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining
NeurIPSW
2023
Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining
NeurIPS
2023
Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection
ICMLW
2023
Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection