Sun, Wen

101 publications

NeurIPS 2025 $Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training Jin Peng Zhou, Kaiwen Wang, Jonathan Daniel Chang, Zhaolin Gao, Nathan Kallus, Kilian Q Weinberger, Kianté Brantley, Wen Sun
ICML 2025 A Reductions Approach to Risk-Sensitive Reinforcement Learning with Optimized Certainty Equivalents Kaiwen Wang, Dawen Liang, Nathan Kallus, Wen Sun
NeurIPS 2025 Accelerating RL for LLM Reasoning with Optimal Advantage Regression Kianté Brantley, Mingyu Chen, Zhaolin Gao, Jason D. Lee, Wen Sun, Wenhao Zhan, Xuezhou Zhang
NeurIPS 2025 Avoiding exp(R) Scaling in RLHF Through Preference-Based Exploration Mingyu Chen, Yiding Chen, Wen Sun, Xuezhou Zhang
ICLR 2025 Computationally Efficient RL Under Linear Bellman Completeness for Deterministic Dynamics Runzhe Wu, Ayush Sekhari, Akshay Krishnamurthy, Wen Sun
ICML 2025 Convergence of Consistency Model with Multistep Sampling Under General Data Assumptions Yiding Chen, Yiyi Zhang, Owen Oertell, Wen Sun
ICLR 2025 Correcting the Mythos of KL-Regularization: Direct Alignment Without Overoptimization via Chi-Squared Preference Optimization Audrey Huang, Wenhao Zhan, Tengyang Xie, Jason D. Lee, Wen Sun, Akshay Krishnamurthy, Dylan J Foster
ICLR 2025 Diffusing States and Matching Scores: A New Framework for Imitation Learning Runzhe Wu, Yiding Chen, Gokul Swamy, Kianté Brantley, Wen Sun
ICLR 2025 Efficient Imitation Under Misspecification Nicolas Espinosa-Dice, Sanjiban Choudhury, Wen Sun, Gokul Swamy
ICLR 2025 Model-Based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds Zhiyong Wang, Dongruo Zhou, John C.S. Lui, Wen Sun
ICLR 2025 On Speeding up Language Model Evaluation Jin Peng Zhou, Christian K Belardi, Ruihan Wu, Travis Zhang, Carla P Gomes, Wen Sun, Kilian Q Weinberger
ICLR 2025 Regressing the Relative Future: Efficient Policy Optimization for Multi-Turn RLHF Zhaolin Gao, Wenhao Zhan, Jonathan Daniel Chang, Gokul Swamy, Kianté Brantley, Jason D. Lee, Wen Sun
NeurIPS 2025 Scaling Offline RL via Efficient and Expressive Shortcut Models Nicolas Espinosa-Dice, Yiyi Zhang, Yiding Chen, Bradley Guo, Owen Oertell, Gokul Swamy, Kianté Brantley, Wen Sun
NeurIPS 2025 Value-Guided Search for Efficient Chain-of-Thought Reasoning Kaiwen Wang, Jin Peng Zhou, Jonathan Daniel Chang, Zhaolin Gao, Nathan Kallus, Kianté Brantley, Wen Sun
ICLR 2024 Adversarial Imitation Learning via Boosting Jonathan Daniel Chang, Dhruv Sreenivas, Yingbing Huang, Kianté Brantley, Wen Sun
ICMLW 2024 Efficient Inverse Reinforcement Learning Without Compounding Errors Nicolas Espinosa Dice, Gokul Swamy, Sanjiban Choudhury, Wen Sun
NeurIPS 2024 Efficient and Sharp Off-Policy Evaluation in Robust Markov Decision Processes Andrew Bennett, Nathan Kallus, Miruna Oprescu, Wen Sun, Kaiwen Wang
AISTATS 2024 Faster Recalibration of an Online Predictor via Approachability Princewill Okoroafor, Bobby Kleinberg, Wen Sun
ICLR 2024 Making RL with Preference-Based Feedback Efficient via Randomization Runzhe Wu, Wen Sun
ICML 2024 More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning Kaiwen Wang, Owen Oertell, Alekh Agarwal, Nathan Kallus, Wen Sun
ICLR 2024 Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees Yifei Zhou, Ayush Sekhari, Yuda Song, Wen Sun
ICLR 2024 Provable Offline Preference-Based Reinforcement Learning Wenhao Zhan, Masatoshi Uehara, Nathan Kallus, Jason D. Lee, Wen Sun
ICLR 2024 Provable Reward-Agnostic Preference-Based Reinforcement Learning Wenhao Zhan, Masatoshi Uehara, Wen Sun, Jason D. Lee
ICLR 2024 Provably Efficient CVaR RL in Low-Rank MDPs Yulai Zhao, Wenhao Zhan, Xiaoyan Hu, Ho-fung Leung, Farzan Farnia, Wen Sun, Jason D. Lee
NeurIPS 2024 REBEL: Reinforcement Learning via Regressing Relative Rewards Zhaolin Gao, Jonathan D. Chang, Wenhao Zhan, Owen Oertell, Gokul Swamy, Kianté Brantley, Thorsten Joachims, J. Andrew Bagnell, Jason D. Lee, Wen Sun
ICMLW 2024 REBEL: Reinforcement Learning via Regressing Relative Rewards Zhaolin Gao, Jonathan Daniel Chang, Wenhao Zhan, Owen Oertell, Gokul Swamy, Kianté Brantley, Thorsten Joachims, J. Andrew Bagnell, Jason D. Lee, Wen Sun
ICMLW 2024 REBEL: Reinforcement Learning via Regressing Relative Rewards Zhaolin Gao, Jonathan Daniel Chang, Wenhao Zhan, Owen Oertell, Gokul Swamy, Kianté Brantley, Thorsten Joachims, J. Andrew Bagnell, Jason D. Lee, Wen Sun
NeurIPS 2024 The Importance of Online Data: Understanding Preference Fine-Tuning via Coverage Yuda Song, Gokul Swamy, Aarti Singh, J. Andrew Bagnell, Wen Sun
ICMLW 2024 The Importance of Online Data: Understanding Preference Fine-Tuning via Coverage Yuda Song, Gokul Swamy, Aarti Singh, Drew Bagnell, Wen Sun
ICML 2023 Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings Masatoshi Uehara, Ayush Sekhari, Jason D. Lee, Nathan Kallus, Wen Sun
NeurIPS 2023 Contextual Bandits and Imitation Learning with Preference-Based Active Queries Ayush Sekhari, Karthik Sridharan, Wen Sun, Runzhe Wu
ICMLW 2023 Contextual Bandits and Imitation Learning with Preference-Based Active Queries Ayush Sekhari, Karthik Sridharan, Wen Sun, Runzhe Wu
ICMLW 2023 Contextual Bandits and Imitation Learning with Preference-Based Active Queries Ayush Sekhari, Karthik Sridharan, Wen Sun, Runzhe Wu
ICML 2023 Distributional Offline Policy Evaluation with Predictive Error Guarantees Runzhe Wu, Masatoshi Uehara, Wen Sun
NeurIPS 2023 Future-Dependent Value-Based Off-Policy Evaluation in POMDPs Masatoshi Uehara, Haruka Kiyohara, Andrew Bennett, Victor Chernozhukov, Nan Jiang, Nathan Kallus, Chengchun Shi, Wen Sun
ICMLW 2023 How to Query Human Feedback Efficiently in RL? Wenhao Zhan, Masatoshi Uehara, Wen Sun, Jason D. Lee
ICMLW 2023 How to Query Human Feedback Efficiently in RL? Wenhao Zhan, Masatoshi Uehara, Wen Sun, Jason D. Lee
ICLR 2023 Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient Yuda Song, Yifei Zhou, Ayush Sekhari, Drew Bagnell, Akshay Krishnamurthy, Wen Sun
NeurIPSW 2023 Koopman-Assisted Reinforcement Learning Preston Rozwood, Edward Mehrez, Ludger Paehler, Wen Sun, Steven Brunton
NeurIPSW 2023 Learning to Generate Better than Your LLM Jonathan Chang, Kianté Brantley, Rajkumar Ramamurthy, Dipendra Misra, Wen Sun
ICML 2023 Multi-Task Representation Learning for Pure Exploration in Linear Bandits Yihan Du, Longbo Huang, Wen Sun
ICML 2023 Near-Minimax-Optimal Risk-Sensitive Reinforcement Learning with CVaR Kaiwen Wang, Nathan Kallus, Wen Sun
NeurIPS 2023 Offline Minimax Soft-Q-Learning Under Realizability and Partial Coverage Masatoshi Uehara, Nathan Kallus, Jason Lee, Wen Sun
ICLR 2023 PAC Reinforcement Learning for Predictive State Representations Wenhao Zhan, Masatoshi Uehara, Wen Sun, Jason D. Lee
COLT 2023 Provable Benefits of Representational Transfer in Reinforcement Learning Alekh Agarwal, Yuda Song, Wen Sun, Kaiwen Wang, Mengdi Wang, Xuezhou Zhang
ICMLW 2023 Provable Offline Reinforcement Learning with Human Feedback Wenhao Zhan, Masatoshi Uehara, Nathan Kallus, Jason D. Lee, Wen Sun
ICMLW 2023 Provable Offline Reinforcement Learning with Human Feedback Wenhao Zhan, Masatoshi Uehara, Nathan Kallus, Jason D. Lee, Wen Sun
NeurIPSW 2023 Provably Efficient CVaR RL in Low-Rank MDPs Yulai Zhao, Wenhao Zhan, Xiaoyan Hu, Ho-fung Leung, Farzan Farnia, Wen Sun, Jason Lee
ICMLW 2023 Representation Learning in Low-Rank Slate-Based Recommender Systems Yijia Dai, Wen Sun
NeurIPS 2023 Reward Finetuning for Faster and More Accurate Unsupervised Object Discovery Katie Luo, Zhenzhen Liu, Xiangyu Chen, Yurong You, Sagie Benaim, Cheng Perng Phoo, Mark Campbell, Wen Sun, Bharath Hariharan, Kilian Q. Weinberger
NeurIPS 2023 Selective Sampling and Imitation Learning via Online Regression Ayush Sekhari, Karthik Sridharan, Wen Sun, Runzhe Wu
ICMLW 2023 Selective Sampling and Imitation Learning via Online Regression Ayush Sekhari, Karthik Sridharan, Wen Sun, Runzhe Wu
NeurIPS 2023 The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning Kaiwen Wang, Kevin Zhou, Runzhe Wu, Nathan Kallus, Wen Sun
AISTATS 2022 Corruption-Robust Offline Reinforcement Learning Xuezhou Zhang, Yiding Chen, Xiaojin Zhu, Wen Sun
ICML 2022 Efficient Reinforcement Learning in Block MDPs: A Model-Free Representation Learning Approach Xuezhou Zhang, Yuda Song, Masatoshi Uehara, Mengdi Wang, Alekh Agarwal, Wen Sun
ICLR 2022 Hindsight Is 20/20: Leveraging Past Traversals to Aid 3D Perception Yurong You, Katie Z Luo, Xiangyu Chen, Junan Chen, Wei-Lun Chao, Wen Sun, Bharath Hariharan, Mark Campbell, Kilian Q Weinberger
NeurIPSW 2022 Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient Yuda Song, Yifei Zhou, Ayush Sekhari, Drew Bagnell, Akshay Krishnamurthy, Wen Sun
ICML 2022 Learning Bellman Complete Representations for Offline Policy Evaluation Jonathan Chang, Kaiwen Wang, Nathan Kallus, Wen Sun
CVPR 2022 Learning to Detect Mobile Objects from LiDAR Scans Without Labels Yurong You, Katie Luo, Cheng Perng Phoo, Wei-Lun Chao, Wen Sun, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger
L4DC 2022 On the Effectiveness of Iterative Learning Control Anirudh Vemula, Wen Sun, Maxim Likhachev, J. Andrew Bagnell
L4DC 2022 Online No-Regret Model-Based Meta RL for Personalized Navigation Yuda Song, Yuan Ye, Wen Sun, Kris Kitani
ICLR 2022 Pessimistic Model-Based Offline Reinforcement Learning Under Partial Coverage Masatoshi Uehara, Wen Sun
NeurIPSW 2022 Provable Benefits of Representational Transfer in Reinforcement Learning Alekh Agarwal, Yuda Song, Kaiwen Wang, Mengdi Wang, Wen Sun, Xuezhou Zhang
NeurIPS 2022 Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems Masatoshi Uehara, Ayush Sekhari, Jason Lee, Nathan Kallus, Wen Sun
ICLR 2022 Representation Learning for Online and Offline RL in Low-Rank MDPs Masatoshi Uehara, Xuezhou Zhang, Wen Sun
ICLR 2022 Transform2Act: Learning a Transform-and-Control Policy for Efficient Agent Design Ye Yuan, Yuda Song, Zhengyi Luo, Wen Sun, Kris M. Kitani
ICML 2021 Bilinear Classes: A Structural Framework for Provable Generalization in RL Simon Du, Sham Kakade, Jason Lee, Shachar Lovett, Gaurav Mahajan, Wen Sun, Ruosong Wang
COLT 2021 Corruption-Robust Exploration in Episodic Reinforcement Learning Thodoris Lykouris, Max Simchowitz, Alex Slivkins, Wen Sun
ICML 2021 Fairness of Exposure in Stochastic Bandits Lequn Wang, Yiwei Bai, Wen Sun, Thorsten Joachims
NeurIPS 2021 Mitigating Covariate Shift in Imitation Learning via Offline Data with Partial Coverage Jonathan Chang, Masatoshi Uehara, Dhruv Sreenivas, Rahul Kidambi, Wen Sun
NeurIPS 2021 MobILE: Model-Based Imitation Learning from Observation Alone Rahul Kidambi, Jonathan Chang, Wen Sun
ICML 2021 PC-MLP: Model-Based Reinforcement Learning with Policy Cover Guided Exploration Yuda Song, Wen Sun
ICML 2021 Robust Policy Gradient Against Strong Data Corruption Xuezhou Zhang, Yiding Chen, Xiaojin Zhu, Wen Sun
NeurIPS 2020 Constrained Episodic Reinforcement Learning in Concave-Convex and Knapsack Settings Kianté Brantley, Miro Dudik, Thodoris Lykouris, Sobhan Miryoosefi, Max Simchowitz, Aleksandrs Slivkins, Wen Sun
ICLR 2020 Disagreement-Regularized Imitation Learning Kianté Brantley, Wen Sun, Mikael Henaff
NeurIPS 2020 FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs Alekh Agarwal, Sham Kakade, Akshay Krishnamurthy, Wen Sun
NeurIPS 2020 Information Theoretic Regret Bounds for Online Nonlinear Control Sham Kakade, Akshay Krishnamurthy, Kendall Lowrey, Motoya Ohnishi, Wen Sun
NeurIPS 2020 Learning the Linear Quadratic Regulator from Nonlinear Observations Zakaria Mhammedi, Dylan J Foster, Max Simchowitz, Dipendra Misra, Wen Sun, Akshay Krishnamurthy, Alexander Rakhlin, John Langford
NeurIPS 2020 Multi-Robot Collision Avoidance Under Uncertainty with Probabilistic Safety Barrier Certificates Wenhao Luo, Wen Sun, Ashish Kapoor
NeurIPS 2020 PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning Alekh Agarwal, Mikael Henaff, Sham Kakade, Wen Sun
ICML 2020 Provably Efficient Model-Based Policy Adaptation Yuda Song, Aditi Mavalankar, Wen Sun, Sicun Gao
ICML 2019 Contextual Memory Trees Wen Sun, Alina Beygelzimer, Hal Daumé III, John Langford, Paul Mineiro
AISTATS 2019 Contrasting Exploration in Parameter and Action Space: A Zeroth-Order Optimization Perspective Anirudh Vemula, Wen Sun, J. Bagnell
COLT 2019 Model-Based RL in Contextual Decision Processes: PAC Bounds and Exponential Improvements over Model-Free Approaches Wen Sun, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford
NeurIPS 2019 Optimal Sketching for Kronecker Product Regression and Low Rank Approximation Huaian Diao, Rajesh Jayaram, Zhao Song, Wen Sun, David Woodruff
NeurIPS 2019 Policy Poisoning in Batch Reinforcement Learning and Control Yuzhe Ma, Xuezhou Zhang, Wen Sun, Xiaojin Zhu
ICML 2019 Provably Efficient Imitation Learning from Observation Alone Wen Sun, Anirudh Vemula, Byron Boots, Drew Bagnell
NeurIPS 2018 Dual Policy Iteration Wen Sun, Geoffrey J. Gordon, Byron Boots, J. Bagnell
ICML 2018 Recurrent Predictive State Policy Networks Ahmed Hefny, Zita Marinho, Wen Sun, Siddhartha Srinivasa, Geoffrey Gordon
AISTATS 2018 Sketching for Kronecker Product Regression and P-Splines Huaian Diao, Zhao Song, Wen Sun, David P. Woodruff
ICLR 2018 Truncated Horizon Policy Search: Combining Reinforcement Learning & Imitation Learning Wen Sun, J. Andrew Bagnell, Byron Boots
ICML 2017 Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction Wen Sun, Arun Venkatraman, Geoffrey J. Gordon, Byron Boots, J. Andrew Bagnell
AISTATS 2017 Gradient Boosting on Stochastic Data Streams Hanzhang Hu, Wen Sun, Arun Venkatraman, Martial Hebert, J. Andrew Bagnell
NeurIPS 2017 Predictive-State Decoders: Encoding the Future into Recurrent Networks Arun Venkatraman, Nicholas Rhinehart, Wen Sun, Lerrel Pinto, Martial Hebert, Byron Boots, Kris Kitani, J. Bagnell
ICML 2017 Safety-Aware Algorithms for Adversarial Contextual Bandit Wen Sun, Debadeepta Dey, Ashish Kapoor
IJCAI 2016 Inference Machines for Nonparametric Filter Learning Arun Venkatraman, Wen Sun, Martial Hebert, Byron Boots, J. Andrew Bagnell
ICML 2016 Learning to Filter with Predictive State Inference Machines Wen Sun, Arun Venkatraman, Byron Boots, J. Andrew Bagnell
UAI 2016 Learning to Smooth with Bidirectional Predictive State Inference Machines Wen Sun, Roberto Capobianco, Geoffrey J. Gordon, J. Andrew Bagnell, Byron Boots
IJCAI 2016 Online Bellman Residual and Temporal Difference Algorithms with Predictive Error Guarantees Wen Sun, J. Andrew Bagnell
AAAI 2016 Online Instrumental Variable Regression with Applications to Online Linear System Identification Arun Venkatraman, Wen Sun, Martial Hebert, J. Andrew Bagnell, Byron Boots
UAI 2015 Online Bellman Residual Algorithms with Predictive Error Guarantees Wen Sun, J. Andrew Bagnell