Li, Yuanzhi

86 publications

ICLR 2025 Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data Binghui Li, Yuanzhi Li

TMLR 2025 Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts Youngseog Chung, Dhruv Malik, Jeff Schneider, Yuanzhi Li, Aarti Singh

ICLR 2025 Mixture of Parrots: Experts Improve Memorization More than Reasoning Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham M. Kakade, Eran Malach

ICML 2025 On the Clean Generalization and Robust Overfitting in Adversarial Training from Two Theoretical Views: Representation Complexity and Training Dynamics Binghui Li, Yuanzhi Li

TMLR 2025 Physics of Language Models: Part 1, Learning Hierarchical Language Structures Zeyuan Allen-Zhu, Yuanzhi Li

ICLR 2025 Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process Tian Ye, Zicheng Xu, Yuanzhi Li, Zeyuan Allen-Zhu

ICLR 2025 Physics of Language Models: Part 2.2, How to Learn from Mistakes on Grade-School Math Problems Tian Ye, Zicheng Xu, Yuanzhi Li, Zeyuan Allen-Zhu

ICLR 2025 Physics of Language Models: Part 3.2, Knowledge Manipulation Zeyuan Allen-Zhu, Yuanzhi Li

ICLR 2025 Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws Zeyuan Allen-Zhu, Yuanzhi Li

NeurIPS 2025 Understanding the Evolution of the Neural Tangent Kernel at the Edge of Stability Kaiqi Jiang, Jeremy Cohen, Yuanzhi Li

NeurIPSW 2024 Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data Binghui Li, Yuanzhi Li

NeurIPSW 2024 Mixture of Parrots: Mixtures of Experts Improve Memorization More than Reasoning Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham M. Kakade, Eran Malach

ICML 2024 Physics of Language Models: Part 3.1, Knowledge Storage and Extraction Zeyuan Allen-Zhu, Yuanzhi Li

AAAI 2024 Revisiting Disentanglement in Downstream Tasks: A Study on Its Necessity for Abstract Visual Reasoning Ruiqian Nai, Zixin Wen, Ji Li, Yuanzhi Li, Yang Gao

ICLR 2024 Role of Locality and Weight Sharing in Image-Based Tasks: A Sample Complexity Separation Between CNNs, LCNs, and FCNs Aakash Lahoti, Stefani Karp, Ezra Winston, Aarti Singh, Yuanzhi Li

ICLR 2024 SmartPlay: A Benchmark for LLMs as Intelligent Agents Yue Wu, Xuan Tang, Tom Mitchell, Yuanzhi Li

ICLR 2024 Understanding Transferable Representation Learning and Zero-Shot Transfer in CLIP Zixiang Chen, Yihe Deng, Yuanzhi Li, Quanquan Gu

COLT 2023 Backward Feature Correction: How Deep Learning Performs Deep (Hierarchical) Learning Zeyuan Allen-Zhu, Yuanzhi Li

ICLR 2023 Forward Super-Resolution: How Can GANs Learn Hierarchical Generative Models for Real-World Distributions Zeyuan Allen-Zhu, Yuanzhi Li

ICML 2023 How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding Yuchen Li, Yuanzhi Li, Andrej Risteski

NeurIPS 2023 How Does Adaptive Optimization Impact Local Neural Network Geometry? Kaiqi Jiang, Dhruv Malik, Yuanzhi Li

NeurIPS 2023 Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals Yue Wu, Yewen Fan, Paul Pu Liang, Amos Azaria, Yuanzhi Li, Tom M. Mitchell

ICLRW 2023 Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals Yue Wu, Yewen Fan, Paul Pu Liang, Amos Azaria, Yuanzhi Li, Tom Mitchell

NeurIPS 2023 SPRING: Studying Papers and Reasoning to Play Games Yue Wu, So Yeon Min, Shrimai Prabhumoye, Yonatan Bisk, Ruslan Salakhutdinov, Amos Azaria, Tom M. Mitchell, Yuanzhi Li

ICLR 2023 Sampling Is as Easy as Learning the Score: Theory for Diffusion Models with Minimal Data Assumptions Sitan Chen, Sinho Chewi, Jerry Li, Yuanzhi Li, Adil Salim, Anru Zhang

NeurIPSW 2023 SmartPlay: A Benchmark for LLMs as Intelligent Agents Yue Wu, Xuan Tang, Tom Mitchell, Yuanzhi Li

ICML 2023 The Benefits of Mixup for Feature Learning Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu

COLT 2023 The Implicit Bias of Batch Normalization in Linear Models and Two-Layer Linear Convolutional Neural Networks Yuan Cao, Difan Zou, Yuanzhi Li, Quanquan Gu

NeurIPS 2023 The Probability Flow ODE Is Provably Fast Sitan Chen, Sinho Chewi, Holden Lee, Yuanzhi Li, Jianfeng Lu, Adil Salim

NeurIPSW 2023 TinyGSM: Achieving 80% on GSM8k with One Billion Parameters Bingbin Liu, Sébastien Bubeck, Ronen Eldan, Janardhan Kulkarni, Yuanzhi Li, Anh Nguyen, Rachel Ward, Yi Zhang

ICLR 2023 Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning Zeyuan Allen-Zhu, Yuanzhi Li

NeurIPSW 2023 Understanding Transferable Representation Learning and Zero-Shot Transfer in CLIP Zixiang Chen, Yihe Deng, Yuanzhi Li, Quanquan Gu

ICLR 2023 Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu

ICML 2023 Weighted Tallying Bandits: Overcoming Intractability via Repeated Exposure Optimality Dhruv Malik, Conor Igoe, Yuanzhi Li, Aarti Singh

COLT 2022 Complete Policy Regret Bounds for Tallying Bandits Dhruv Malik, Yuanzhi Li, Aarti Singh

NeurIPS 2022 Learning (Very) Simple Generative Models Is Hard Sitan Chen, Jerry Li, Yuanzhi Li

ICLR 2022 LoRA: Low-Rank Adaptation of Large Language Models Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen

ICLR 2022 Minimax Optimality (Probably) Doesn't Imply Distribution Learning for GANs Sitan Chen, Jerry Li, Yuanzhi Li, Raghu Meka

NeurIPSW 2022 Sampling Is as Easy as Learning the Score: Theory for Diffusion Models with Minimal Data Assumptions Sitan Chen, Sinho Chewi, Jerry Li, Yuanzhi Li, Adil Salim, Anru Zhang

NeurIPS 2022 The Mechanism of Prediction Head in Non-Contrastive Self-Supervised Learning Zixin Wen, Yuanzhi Li

NeurIPSW 2022 Toward Understanding Why Adam Converges Faster than SGD for Transformers Yan Pan, Yuanzhi Li

ICML 2022 Towards Understanding How Momentum Improves Generalization in Deep Learning Samy Jelassi, Yuanzhi Li

NeurIPS 2022 Towards Understanding the Mixture-of-Experts Layer in Deep Learning Zixiang Chen, Yihe Deng, Yue Wu, Quanquan Gu, Yuanzhi Li

NeurIPS 2022 Vision Transformers Provably Learn Spatial Structure Samy Jelassi, Michael Sander, Yuanzhi Li

UAI 2021 A Heuristic for Statistical Seriation Komal Dhull, Jingyan Wang, Nihar B. Shah, Yuanzhi Li, R. Ravi

COLT 2021 A Law of Robustness for Two-Layers Neural Networks Sébastien Bubeck, Yuanzhi Li, Dheeraj M Nagaraj

ICLR 2021 Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability Jeremy Cohen, Simran Kaur, Yuanzhi Li, J Zico Kolter, Ameet Talwalkar

NeurIPS 2021 Local Signal Adaptivity: Provable Feature Learning in Neural Networks Beyond Kernels Stefani Karp, Ezra Winston, Yuanzhi Li, Aarti Singh

ICML 2021 Sample Efficient Reinforcement Learning in Continuous State Spaces: A Perspective Beyond Linearity Dhruv Malik, Aldo Pacchiano, Vishwak Srinivasan, Yuanzhi Li

ICML 2021 Toward Understanding the Feature Learning Process of Self-Supervised Contrastive Learning Zixin Wen, Yuanzhi Li

NeurIPS 2021 When Is Generalizable Reinforcement Learning Tractable? Dhruv Malik, Yuanzhi Li, Pradeep K. Ravikumar

COLT 2020 Learning Over-Parametrized Two-Layer Neural Networks Beyond NTK Yuanzhi Li, Tengyu Ma, Hongyang R. Zhang

COLT 2020 Non-Stochastic Multi-Player Multi-Armed Bandits: Optimal Rate with Collision Information, Sublinear Without Sébastien Bubeck, Yuanzhi Li, Yuval Peres, Mark Sellke

ICML 2019 A Convergence Theory for Deep Learning via Over-Parameterization Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song

ICLR 2019 Algorithmic Framework for Model-Based Deep Reinforcement Learning with Theoretical Guarantees Yuping Luo, Huazhe Xu, Yuanzhi Li, Yuandong Tian, Trevor Darrell, Tengyu Ma

NeurIPS 2019 Can SGD Learn Recurrent Neural Networks with Provable Generalization? Zeyuan Allen-Zhu, Yuanzhi Li

NeurIPS 2019 Complexity of Highly Parallel Non-Smooth Convex Optimization Sébastien Bubeck, Qijia Jiang, Yin Tat Lee, Yuanzhi Li, Aaron Sidford

COLT 2019 Improved Path-Length Regret Bounds for Bandits Sébastien Bubeck, Yuanzhi Li, Haipeng Luo, Chen-Yu Wei

NeurIPS 2019 Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang

COLT 2019 Near Optimal Methods for Minimizing Convex Functions with Lipschitz $p$-th Derivatives Alexander Gasnikov, Pavel Dvurechensky, Eduard Gorbunov, Evgeniya Vorontsova, Daniil Selikhanovych, César A. Uribe, Bo Jiang, Haoyue Wang, Shuzhong Zhang, Sébastien Bubeck, Qijia Jiang, Yin Tat Lee, Yuanzhi Li, Aaron Sidford

COLT 2019 Near-Optimal Method for Highly Smooth Convex Optimization Sébastien Bubeck, Qijia Jiang, Yin Tat Lee, Yuanzhi Li, Aaron Sidford

NeurIPS 2019 On the Convergence Rate of Training Recurrent Neural Networks Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song

NeurIPS 2019 Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks Yuanzhi Li, Colin Wei, Tengyu Ma

NeurIPS 2019 What Can ResNet Learn Efficiently, Going Beyond Kernels? Zeyuan Allen-Zhu, Yuanzhi Li

COLT 2018 Algorithmic Regularization in Over-Parameterized Matrix Sensing and Neural Networks with Quadratic Activations Yuanzhi Li, Tengyu Ma, Hongyang Zhang

ICML 2018 An Alternative View: When Does SGD Escape Local Minima? Bobby Kleinberg, Yuanzhi Li, Yang Yuan

COLT 2018 Learning Mixtures of Linear Regressions with Nearly Optimal Complexity Yuanzhi Li, Yingyu Liang

NeurIPS 2018 Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data Yuanzhi Li, Yingyu Liang

ICML 2018 Make the Minority Great Again: First-Order Regret Bound for Contextual Bandits Zeyuan Allen-Zhu, Sébastien Bubeck, Yuanzhi Li

NeurIPS 2018 NEON2: Finding Local Minima via First-Order Oracles Zeyuan Allen-Zhu, Yuanzhi Li

NeurIPS 2018 Online Improper Learning with an Approximation Oracle Elad Hazan, Wei Hu, Yuanzhi Li, Zhiyuan Li

ALT 2018 Sparsity, Variance and Curvature in Multi-Armed Bandits Sébastien Bubeck, Michael Cohen, Yuanzhi Li

ICML 2018 The Well-Tempered Lasso Yuanzhi Li, Yoram Singer

NeurIPS 2017 Convergence Analysis of Two-Layer Neural Networks with ReLU Activation Yuanzhi Li, Yang Yuan

ICML 2017 Doubly Accelerated Methods for Faster CCA and Generalized Eigendecomposition Zeyuan Allen-Zhu, Yuanzhi Li

ICML 2017 Faster Principal Component Regression and Stable Matrix Chebyshev Approximation Zeyuan Allen-Zhu, Yuanzhi Li

ICML 2017 Follow the Compressed Leader: Faster Online Learning of Eigenvectors and Faster MMWU Zeyuan Allen-Zhu, Yuanzhi Li

NeurIPS 2017 Linear Convergence of a Frank-Wolfe Type Algorithm over Trace-Norm Balls Zeyuan Allen-Zhu, Elad Hazan, Wei Hu, Yuanzhi Li

ICML 2017 Near-Optimal Design of Experiments via Regret Minimization Zeyuan Allen-Zhu, Yuanzhi Li, Aarti Singh, Yining Wang

ICML 2017 Provable Alternating Gradient Descent for Non-Negative Matrix Factorization with Strong Correlations Yuanzhi Li, Yingyu Liang

NeurIPS 2016 Algorithms and Matching Lower Bounds for Approximately-Convex Optimization Andrej Risteski, Yuanzhi Li

NeurIPS 2016 Approximate Maximum Entropy Principles via Goemans-Williamson with Applications to Provable Variational Methods Andrej Risteski, Yuanzhi Li

NeurIPS 2016 LazySVD: Even Faster SVD Decomposition yet Without Agonizing Pain Zeyuan Allen-Zhu, Yuanzhi Li

NeurIPS 2016 Recovery Guarantee of Non-Negative Matrix Factorization via Alternating Updates Yuanzhi Li, Yingyu Liang, Andrej Risteski

ICML 2016 Recovery Guarantee of Weighted Low-Rank Approximation via Alternating Minimization Yuanzhi Li, Yingyu Liang, Andrej Risteski

COLT 2013 A Theoretical Analysis of NDCG Type Ranking Measures Yining Wang, Liwei Wang, Yuanzhi Li, Di He, Tie-Yan Liu