Du, Simon Shaolei

60 publications

NeurIPS 2025 A Minimalist Example of Edge-of-Stability and Progressive Sharpening Liming Liu, Zixuan Zhang, Simon Shaolei Du, Tuo Zhao
ICML 2025 Cross-Environment Cooperation Enables Zero-Shot Multi-Agent Coordination Kunal Jha, Wilka Carvalho, Yancheng Liang, Simon Shaolei Du, Max Kleiman-Weiner, Natasha Jaques
NeurIPS 2025 Deployment Efficient Reward-Free Exploration with Linear Function Approximation Zihan Zhang, Yuxin Chen, Jason D. Lee, Simon Shaolei Du, Lin Yang, Ruosong Wang
NeurIPS 2025 Highlighting What Matters: Promptable Embeddings for Attribute-Focused Image Retrieval Siting Li, Xiang Gao, Simon Shaolei Du
ICLRW 2025 Hybrid Preference Optimization for Alignment: Provably Faster Convergence Rates by Combining Offline Preferences with Online Exploration Avinandan Bose, Zhihan Xiong, Aadirupa Saha, Simon Shaolei Du, Maryam Fazel
CVPR 2025 Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation Yiping Wang, Xuehai He, Kuan Wang, Luyao Ma, Jianwei Yang, Shuohang Wang, Simon Shaolei Du, Yelong Shen
ICML 2025 Minimax Optimal Regret Bound for Reinforcement Learning with Trajectory Feedback Zihan Zhang, Yuxin Chen, Jason D. Lee, Simon Shaolei Du, Ruosong Wang
AISTATS 2025 Offline Multi-Task Transfer RL with Representational Penalization Avinandan Bose, Simon Shaolei Du, Maryam Fazel
NeurIPS 2025 Reinforcement Learning for Reasoning in Large Language Models with One Training Example Yiping Wang, Qing Yang, Zhiyuan Zeng, Liliang Ren, Liyuan Liu, Baolin Peng, Hao Cheng, Xuehai He, Kuan Wang, Jianfeng Gao, Weizhu Chen, Shuohang Wang, Simon Shaolei Du, Yelong Shen
NeurIPS 2025 Sharp Gap-Dependent Variance-Aware Regret Bounds for Tabular MDPs Shulun Chen, Runlong Zhou, Zihan Zhang, Maryam Fazel, Simon Shaolei Du
ICLR 2025 The Crucial Role of Samplers in Online Direct Preference Optimization Ruizhe Shi, Runlong Zhou, Simon Shaolei Du
NeurIPS 2025 Understanding the Gain from Data Filtering in Multimodal Contrastive Learning Divyansh Pareek, Sewoong Oh, Simon Shaolei Du
ICLR 2024 A Black-Box Approach for Non-Stationary Multi-Agent Reinforcement Learning Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon Shaolei Du
NeurIPS 2024 CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning Yiping Wang, Yifang Chen, Wendan Yan, Alex Fang, Wenjing Zhou, Kevin Jamieson, Simon Shaolei Du
ICMLW 2024 Decoding-Time Language Model Alignment with Multiple Objectives Ruizhe Shi, Yifang Chen, Yushi Hu, Alisa Liu, Hannaneh Hajishirzi, Noah A. Smith, Simon Shaolei Du
ICLR 2024 Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking Kaifeng Lyu, Jikai Jin, Zhiyuan Li, Simon Shaolei Du, Jason D. Lee, Wei Hu
NeurIPS 2024 Distributional Successor Features Enable Zero-Shot Policy Optimization Chuning Zhu, Xinqi Wang, Tyler Han, Simon Shaolei Du, Abhishek Gupta
ICLR 2024 Free from Bellman Completeness: Trajectory Stitching via Model-Based Return-Conditioned Supervised Learning Zhaoyi Zhou, Chuning Zhu, Runlong Zhou, Qiwen Cui, Abhishek Gupta, Simon Shaolei Du
ICLR 2024 Horizon-Free Regret for Linear Markov Decision Processes Zihan Zhang, Jason D. Lee, Yuxin Chen, Simon Shaolei Du
ICLR 2024 How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization Nuoya Xiong, Lijun Ding, Simon Shaolei Du
ICLR 2024 JoMA: Demystifying Multilayer Transformers via Joint Dynamics of MLP and Attention Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon Shaolei Du
DMLR 2024 LabelBench: A Comprehensive Framework for Benchmarking Adaptive Label-Efficient Learning Jifan Zhang, Yifang Chen, Gregory Canal, Arnav Mohanty Das, Gantavya Bhatt, Stephen Mussmann, Yinglun Zhu, Jeff Bilmes, Simon Shaolei Du, Kevin Jamieson, Robert D. Nowak
NeurIPSW 2024 Learning to Cooperate with Humans Using Generative Agents Yancheng Liang, Daphne Chen, Abhishek Gupta, Simon Shaolei Du, Natasha Jaques
ICML 2024 Rethinking Transformers in Solving POMDPs Chenhao Lu, Ruizhe Shi, Yuyao Liu, Kaizhe Hu, Simon Shaolei Du, Huazhe Xu
NeurIPSW 2024 The Crucial Role of Samplers in Online Direct Preference Optimization Ruizhe Shi, Runlong Zhou, Simon Shaolei Du
ICMLW 2024 Toward Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixture Models Weihang Xu, Maryam Fazel, Simon Shaolei Du
ICMLW 2024 Transferable Reinforcement Learning via Generalized Occupancy Models Chuning Zhu, Xinqi Wang, Tyler Han, Simon Shaolei Du, Abhishek Gupta
NeurIPSW 2024 Transformers Are Efficient Compilers, Provably Xiyu Zhai, Runlong Zhou, Liao Zhang, Simon Shaolei Du
ICLR 2024 Unleashing the Power of Pre-Trained Language Models for Offline Reinforcement Learning Ruizhe Shi, Yuyao Liu, Yanjie Ze, Simon Shaolei Du, Huazhe Xu
TMLR 2023 Beyond Information Gain: An Empirical Benchmark for Low-Switching-Cost Reinforcement Learning Shusheng Xu, Yancheng Liang, Yunfei Li, Simon Shaolei Du, Yi Wu
ICMLW 2023 Breaking the Curse of Multiagents in a Large State Space: RL in Markov Games with Independent Linear Function Approximation Qiwen Cui, Kaiqing Zhang, Simon Shaolei Du
ICLR 2023 Faster Last-Iterate Convergence of Policy Optimization in Zero-Sum Markov Games Shicong Cen, Yuejie Chi, Simon Shaolei Du, Lin Xiao
NeurIPSW 2023 Free from Bellman Completeness: Trajectory Stitching via Model-Based Return-Conditioned Supervised Learning Zhaoyi Zhou, Chuning Zhu, Runlong Zhou, Qiwen Cui, Abhishek Gupta, Simon Shaolei Du
ICML 2023 Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes Runlong Zhou, Ruosong Wang, Simon Shaolei Du
NeurIPSW 2023 How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization Nuoya Xiong, Lijun Ding, Simon Shaolei Du
ICML 2023 Improved Active Multi-Task Representation Learning via Lasso Yiping Wang, Yifang Chen, Kevin Jamieson, Simon Shaolei Du
NeurIPSW 2023 LabelBench: A Comprehensive Framework for Benchmarking Adaptive Label-Efficient Learning Jifan Zhang, Yifang Chen, Gregory Canal, Arnav Mohanty Das, Gantavya Bhatt, Yinglun Zhu, Stephen Mussmann, Simon Shaolei Du, Jeff Bilmes, Kevin Jamieson, Robert D. Nowak
ICLR 2023 Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies Rui Yuan, Simon Shaolei Du, Robert M. Gower, Alessandro Lazaric, Lin Xiao
ICLR 2023 Offline Congestion Games: How Feedback Type Affects Data Coverage Requirement Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon Shaolei Du
ICML 2023 On the Power of Pre-Training for Generalization in RL: Provable Benefits and Hardness Haotian Ye, Xiaoyu Chen, Liwei Wang, Simon Shaolei Du
NeurIPSW 2023 On the Synergy Between Label Noise and Learning Rate Annealing in Neural Network Training Stanley Wei, Tongzheng Ren, Simon Shaolei Du
ICML 2023 Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments Runlong Zhou, Zihan Zhang, Simon Shaolei Du
TMLR 2023 Understanding Curriculum Learning in Policy Optimization for Online Combinatorial Optimization Runlong Zhou, Zelin He, Yuandong Tian, Yi Wu, Simon Shaolei Du
ICML 2023 Understanding Incremental Learning of Gradient Descent: A Fine-Grained Analysis of Matrix Sensing Jikai Jin, Zhiyuan Li, Kaifeng Lyu, Simon Shaolei Du, Jason D. Lee
NeurIPSW 2023 Unleashing the Power of Pre-Trained Language Models for Offline Reinforcement Learning Ruizhe Shi, Yuyao Liu, Yanjie Ze, Simon Shaolei Du, Huazhe Xu
ICLR 2023 Variance-Aware Sparse Linear Bandits Yan Dai, Ruosong Wang, Simon Shaolei Du
ICLR 2022 A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning Yunchang Yang, Tianhao Wu, Han Zhong, Evrard Garcelon, Matteo Pirotta, Alessandro Lazaric, Liwei Wang, Simon Shaolei Du
AAAI 2022 AdaLoss: A Computationally-Efficient and Provably Convergent Adaptive Gradient Method Xiaoxia Wu, Yuege Xie, Simon Shaolei Du, Rachel A. Ward
ICLR 2022 Provable Adaptation Across Multiway Domains via Representation Learning Zhili Feng, Shaobo Han, Simon Shaolei Du
NeurIPSW 2022 Understanding Curriculum Learning in Policy Optimization for Online Combinatorial Optimization Runlong Zhou, Yuandong Tian, Yi Wu, Simon Shaolei Du
ICLRW 2022 When Is Offline Two-Player Zero-Sum Markov Game Solvable? Qiwen Cui, Simon Shaolei Du
ICLR 2021 Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization Zhenggang Tang, Chao Yu, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Shaolei Du, Yu Wang, Yi Wu
ICLR 2021 Few-Shot Learning via Learning the Representation, Provably Simon Shaolei Du, Wei Hu, Sham M. Kakade, Jason D. Lee, Qi Lei
ICLR 2021 How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks Keyulu Xu, Mozhi Zhang, Jingling Li, Simon Shaolei Du, Ken-Ichi Kawarabayashi, Stefanie Jegelka
ICLR 2021 Impact of Representation Learning in Linear Bandits Jiaqi Yang, Wei Hu, Jason D. Lee, Simon Shaolei Du
ICLR 2021 Optimism in Reinforcement Learning with Generalized Linear Function Approximation Yining Wang, Ruosong Wang, Simon Shaolei Du, Akshay Krishnamurthy
COLT 2016 An Improved Gap-Dependency Analysis of the Noisy Power Method Maria-Florina Balcan, Simon Shaolei Du, Yining Wang, Adams Wei Yu