Du, Simon Shaolei
60 publications
NeurIPS
2025
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
TMLR
2023
Beyond Information Gain: An Empirical Benchmark for Low-Switching-Cost Reinforcement Learning
NeurIPSW
2023
Free from Bellman Completeness: Trajectory Stitching via Model-Based Return-Conditioned Supervised Learning
ICML
2023
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes
NeurIPSW
2023
How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization
NeurIPSW
2023
LabelBench: A Comprehensive Framework for Benchmarking Adaptive Label-Efficient Learning
NeurIPSW
2023
On the Synergy Between Label Noise and Learning Rate Annealing in Neural Network Training
TMLR
2023
Understanding Curriculum Learning in Policy Optimization for Online Combinatorial Optimization
ICML
2023
Understanding Incremental Learning of Gradient Descent: A Fine-Grained Analysis of Matrix Sensing
NeurIPSW
2023
Unleashing the Power of Pre-Trained Language Models for Offline Reinforcement Learning