Li, Yuanzhi

86 publications

ICLR 2025 Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data Binghui Li, Yuanzhi Li
TMLR 2025 Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts Youngseog Chung, Dhruv Malik, Jeff Schneider, Yuanzhi Li, Aarti Singh
ICLR 2025 Mixture of Parrots: Experts Improve Memorization More than Reasoning Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham M. Kakade, Eran Malach
ICML 2025 On the Clean Generalization and Robust Overfitting in Adversarial Training from Two Theoretical Views: Representation Complexity and Training Dynamics Binghui Li, Yuanzhi Li
TMLR 2025 Physics of Language Models: Part 1, Learning Hierarchical Language Structures Zeyuan Allen-Zhu, Yuanzhi Li
ICLR 2025 Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process Tian Ye, Zicheng Xu, Yuanzhi Li, Zeyuan Allen-Zhu
ICLR 2025 Physics of Language Models: Part 2.2, How to Learn from Mistakes on Grade-School Math Problems Tian Ye, Zicheng Xu, Yuanzhi Li, Zeyuan Allen-Zhu
ICLR 2025 Physics of Language Models: Part 3.2, Knowledge Manipulation Zeyuan Allen-Zhu, Yuanzhi Li
ICLR 2025 Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws Zeyuan Allen-Zhu, Yuanzhi Li
NeurIPS 2025 Understanding the Evolution of the Neural Tangent Kernel at the Edge of Stability Kaiqi Jiang, Jeremy Cohen, Yuanzhi Li
NeurIPSW 2024 Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data Binghui Li, Yuanzhi Li
NeurIPSW 2024 Mixture of Parrots: Mixtures of Experts Improve Memorization More than Reasoning Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham M. Kakade, Eran Malach
ICML 2024 Physics of Language Models: Part 3.1, Knowledge Storage and Extraction Zeyuan Allen-Zhu, Yuanzhi Li
AAAI 2024 Revisiting Disentanglement in Downstream Tasks: A Study on Its Necessity for Abstract Visual Reasoning Ruiqian Nai, Zixin Wen, Ji Li, Yuanzhi Li, Yang Gao
ICLR 2024 Role of Locality and Weight Sharing in Image-Based Tasks: A Sample Complexity Separation Between CNNs, LCNs, and FCNs Aakash Lahoti, Stefani Karp, Ezra Winston, Aarti Singh, Yuanzhi Li
ICLR 2024 SmartPlay: A Benchmark for LLMs as Intelligent Agents Yue Wu, Xuan Tang, Tom Mitchell, Yuanzhi Li
ICLR 2024 Understanding Transferable Representation Learning and Zero-Shot Transfer in CLIP Zixiang Chen, Yihe Deng, Yuanzhi Li, Quanquan Gu
COLT 2023 Backward Feature Correction: How Deep Learning Performs Deep (Hierarchical) Learning Zeyuan Allen-Zhu, Yuanzhi Li
ICLR 2023 Forward Super-Resolution: How Can GANs Learn Hierarchical Generative Models for Real-World Distributions Zeyuan Allen-Zhu, Yuanzhi Li
ICML 2023 How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding Yuchen Li, Yuanzhi Li, Andrej Risteski
NeurIPS 2023 How Does Adaptive Optimization Impact Local Neural Network Geometry? Kaiqi Jiang, Dhruv Malik, Yuanzhi Li
NeurIPS 2023 Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals Yue Wu, Yewen Fan, Paul Pu Liang, Amos Azaria, Yuanzhi Li, Tom M. Mitchell
ICLRW 2023 Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals Yue Wu, Yewen Fan, Paul Pu Liang, Amos Azaria, Yuanzhi Li, Tom Mitchell
NeurIPS 2023 SPRING: Studying Papers and Reasoning to Play Games Yue Wu, So Yeon Min, Shrimai Prabhumoye, Yonatan Bisk, Ruslan Salakhutdinov, Amos Azaria, Tom M. Mitchell, Yuanzhi Li
ICLR 2023 Sampling Is as Easy as Learning the Score: Theory for Diffusion Models with Minimal Data Assumptions Sitan Chen, Sinho Chewi, Jerry Li, Yuanzhi Li, Adil Salim, Anru Zhang
NeurIPSW 2023 SmartPlay: A Benchmark for LLMs as Intelligent Agents Yue Wu, Xuan Tang, Tom Mitchell, Yuanzhi Li
ICML 2023 The Benefits of Mixup for Feature Learning Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu
COLT 2023 The Implicit Bias of Batch Normalization in Linear Models and Two-Layer Linear Convolutional Neural Networks Yuan Cao, Difan Zou, Yuanzhi Li, Quanquan Gu
NeurIPS 2023 The Probability Flow ODE Is Provably Fast Sitan Chen, Sinho Chewi, Holden Lee, Yuanzhi Li, Jianfeng Lu, Adil Salim
NeurIPSW 2023 TinyGSM: Achieving 80% on GSM8k with One Billion Parameters Bingbin Liu, Sébastien Bubeck, Ronen Eldan, Janardhan Kulkarni, Yuanzhi Li, Anh Nguyen, Rachel Ward, Yi Zhang
ICLR 2023 Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning Zeyuan Allen-Zhu, Yuanzhi Li
NeurIPSW 2023 Understanding Transferable Representation Learning and Zero-Shot Transfer in CLIP Zixiang Chen, Yihe Deng, Yuanzhi Li, Quanquan Gu
ICLR 2023 Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu
ICML 2023 Weighted Tallying Bandits: Overcoming Intractability via Repeated Exposure Optimality Dhruv Malik, Conor Igoe, Yuanzhi Li, Aarti Singh
COLT 2022 Complete Policy Regret Bounds for Tallying Bandits Dhruv Malik, Yuanzhi Li, Aarti Singh
NeurIPS 2022 Learning (Very) Simple Generative Models Is Hard Sitan Chen, Jerry Li, Yuanzhi Li
ICLR 2022 LoRA: Low-Rank Adaptation of Large Language Models Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen
ICLR 2022 Minimax Optimality (Probably) Doesn't Imply Distribution Learning for GANs Sitan Chen, Jerry Li, Yuanzhi Li, Raghu Meka
NeurIPSW 2022 Sampling Is as Easy as Learning the Score: Theory for Diffusion Models with Minimal Data Assumptions Sitan Chen, Sinho Chewi, Jerry Li, Yuanzhi Li, Adil Salim, Anru Zhang
NeurIPS 2022 The Mechanism of Prediction Head in Non-Contrastive Self-Supervised Learning Zixin Wen, Yuanzhi Li
NeurIPSW 2022 Toward Understanding Why Adam Converges Faster than SGD for Transformers Yan Pan, Yuanzhi Li
ICML 2022 Towards Understanding How Momentum Improves Generalization in Deep Learning Samy Jelassi, Yuanzhi Li
NeurIPS 2022 Towards Understanding the Mixture-of-Experts Layer in Deep Learning Zixiang Chen, Yihe Deng, Yue Wu, Quanquan Gu, Yuanzhi Li
NeurIPS 2022 Vision Transformers Provably Learn Spatial Structure Samy Jelassi, Michael Sander, Yuanzhi Li
UAI 2021 A Heuristic for Statistical Seriation Komal Dhull, Jingyan Wang, Nihar B. Shah, Yuanzhi Li, R. Ravi
COLT 2021 A Law of Robustness for Two-Layers Neural Networks Sébastien Bubeck, Yuanzhi Li, Dheeraj M Nagaraj
ICLR 2021 Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability Jeremy Cohen, Simran Kaur, Yuanzhi Li, J Zico Kolter, Ameet Talwalkar
NeurIPS 2021 Local Signal Adaptivity: Provable Feature Learning in Neural Networks Beyond Kernels Stefani Karp, Ezra Winston, Yuanzhi Li, Aarti Singh
ICML 2021 Sample Efficient Reinforcement Learning in Continuous State Spaces: A Perspective Beyond Linearity Dhruv Malik, Aldo Pacchiano, Vishwak Srinivasan, Yuanzhi Li
ICML 2021 Toward Understanding the Feature Learning Process of Self-Supervised Contrastive Learning Zixin Wen, Yuanzhi Li
NeurIPS 2021 When Is Generalizable Reinforcement Learning Tractable? Dhruv Malik, Yuanzhi Li, Pradeep K. Ravikumar
COLT 2020 Learning Over-Parametrized Two-Layer Neural Networks Beyond NTK Yuanzhi Li, Tengyu Ma, Hongyang R. Zhang
COLT 2020 Non-Stochastic Multi-Player Multi-Armed Bandits: Optimal Rate with Collision Information, Sublinear Without Sébastien Bubeck, Yuanzhi Li, Yuval Peres, Mark Sellke
ICML 2019 A Convergence Theory for Deep Learning via Over-Parameterization Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song
ICLR 2019 Algorithmic Framework for Model-Based Deep Reinforcement Learning with Theoretical Guarantees Yuping Luo, Huazhe Xu, Yuanzhi Li, Yuandong Tian, Trevor Darrell, Tengyu Ma
NeurIPS 2019 Can SGD Learn Recurrent Neural Networks with Provable Generalization? Zeyuan Allen-Zhu, Yuanzhi Li
NeurIPS 2019 Complexity of Highly Parallel Non-Smooth Convex Optimization Sébastien Bubeck, Qijia Jiang, Yin Tat Lee, Yuanzhi Li, Aaron Sidford
COLT 2019 Improved Path-Length Regret Bounds for Bandits Sébastien Bubeck, Yuanzhi Li, Haipeng Luo, Chen-Yu Wei
NeurIPS 2019 Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang
COLT 2019 Near Optimal Methods for Minimizing Convex Functions with Lipschitz $p$-th Derivatives Alexander Gasnikov, Pavel Dvurechensky, Eduard Gorbunov, Evgeniya Vorontsova, Daniil Selikhanovych, César A. Uribe, Bo Jiang, Haoyue Wang, Shuzhong Zhang, Sébastien Bubeck, Qijia Jiang, Yin Tat Lee, Yuanzhi Li, Aaron Sidford
COLT 2019 Near-Optimal Method for Highly Smooth Convex Optimization Sébastien Bubeck, Qijia Jiang, Yin Tat Lee, Yuanzhi Li, Aaron Sidford
NeurIPS 2019 On the Convergence Rate of Training Recurrent Neural Networks Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song
NeurIPS 2019 Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks Yuanzhi Li, Colin Wei, Tengyu Ma
NeurIPS 2019 What Can ResNet Learn Efficiently, Going Beyond Kernels? Zeyuan Allen-Zhu, Yuanzhi Li
COLT 2018 Algorithmic Regularization in Over-Parameterized Matrix Sensing and Neural Networks with Quadratic Activations Yuanzhi Li, Tengyu Ma, Hongyang Zhang
ICML 2018 An Alternative View: When Does SGD Escape Local Minima? Bobby Kleinberg, Yuanzhi Li, Yang Yuan
COLT 2018 Learning Mixtures of Linear Regressions with Nearly Optimal Complexity Yuanzhi Li, Yingyu Liang
NeurIPS 2018 Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data Yuanzhi Li, Yingyu Liang
ICML 2018 Make the Minority Great Again: First-Order Regret Bound for Contextual Bandits Zeyuan Allen-Zhu, Sébastien Bubeck, Yuanzhi Li
NeurIPS 2018 NEON2: Finding Local Minima via First-Order Oracles Zeyuan Allen-Zhu, Yuanzhi Li
NeurIPS 2018 Online Improper Learning with an Approximation Oracle Elad Hazan, Wei Hu, Yuanzhi Li, Zhiyuan Li
ALT 2018 Sparsity, Variance and Curvature in Multi-Armed Bandits Sébastien Bubeck, Michael Cohen, Yuanzhi Li
ICML 2018 The Well-Tempered Lasso Yuanzhi Li, Yoram Singer
NeurIPS 2017 Convergence Analysis of Two-Layer Neural Networks with ReLU Activation Yuanzhi Li, Yang Yuan
ICML 2017 Doubly Accelerated Methods for Faster CCA and Generalized Eigendecomposition Zeyuan Allen-Zhu, Yuanzhi Li
ICML 2017 Faster Principal Component Regression and Stable Matrix Chebyshev Approximation Zeyuan Allen-Zhu, Yuanzhi Li
ICML 2017 Follow the Compressed Leader: Faster Online Learning of Eigenvectors and Faster MMWU Zeyuan Allen-Zhu, Yuanzhi Li
NeurIPS 2017 Linear Convergence of a Frank-Wolfe Type Algorithm over Trace-Norm Balls Zeyuan Allen-Zhu, Elad Hazan, Wei Hu, Yuanzhi Li
ICML 2017 Near-Optimal Design of Experiments via Regret Minimization Zeyuan Allen-Zhu, Yuanzhi Li, Aarti Singh, Yining Wang
ICML 2017 Provable Alternating Gradient Descent for Non-Negative Matrix Factorization with Strong Correlations Yuanzhi Li, Yingyu Liang
NeurIPS 2016 Algorithms and Matching Lower Bounds for Approximately-Convex Optimization Andrej Risteski, Yuanzhi Li
NeurIPS 2016 Approximate Maximum Entropy Principles via Goemans-Williamson with Applications to Provable Variational Methods Andrej Risteski, Yuanzhi Li
NeurIPS 2016 LazySVD: Even Faster SVD Decomposition yet Without Agonizing Pain Zeyuan Allen-Zhu, Yuanzhi Li
NeurIPS 2016 Recovery Guarantee of Non-Negative Matrix Factorization via Alternating Updates Yuanzhi Li, Yingyu Liang, Andrej Risteski
ICML 2016 Recovery Guarantee of Weighted Low-Rank Approximation via Alternating Minimization Yuanzhi Li, Yingyu Liang, Andrej Risteski
COLT 2013 A Theoretical Analysis of NDCG Type Ranking Measures Yining Wang, Liwei Wang, Yuanzhi Li, Di He, Tie-Yan Liu