Lee, Jason D.
86 publications
ICML
2025
Discrepancies Are Virtue: Weak-to-Strong Generalization Through Lens of Intrinsic Dimension
ICML
2025
Metastable Dynamics of Chain-of-Thought Reasoning: Provable Benefits of Search, RL and Distillation
ICML
2025
Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding
NeurIPS
2025
The Generative Leap: Tight Sample Complexity for Efficiently Learning Gaussian Multi-Index Models
NeurIPS
2025
What One Cannot, Two Can: Two-Layer Transformers Provably Represent Induction Heads on Any-Order Markov Chains
ICMLW
2024
Neural Network Learns Low-Dimensional Polynomials with SGD near the Information-Theoretic Limit
NeurIPS
2024
Neural Network Learns Low-Dimensional Polynomials with SGD near the Information-Theoretic Limit
NeurIPS
2024
Stochastic Zeroth-Order Optimization Under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity
ICML
2023
Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings
ICLR
2023
Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-Regret Learning in Markov Games
AISTATS
2023
Optimal Sample Complexity Bounds for Non-Convex Optimization Under Kurdyka-Lojasiewicz Condition
ICMLW
2023
Reward Collapse in Aligning Large Language Models: A Prompt-Aware Approach to Preference Rankings
ICML
2023
Understanding Incremental Learning of Gradient Descent: A Fine-Grained Analysis of Matrix Sensing
JMLR
2021
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
JMLR
2017
Distributed Stochastic Variance Reduced Gradient Methods by Sampling Extra Data with Replacement