Xie, Tengyang

23 publications

ICLR 2025: "Correcting the Mythos of KL-Regularization: Direct Alignment Without Overoptimization via Chi-Squared Preference Optimization." Audrey Huang, Wenhao Zhan, Tengyang Xie, Jason D. Lee, Wen Sun, Akshay Krishnamurthy, Dylan J. Foster.
ICML 2025: "Do We Need to Verify Step by Step? Rethinking Process Supervision from a Theoretical Perspective." Zeyu Jia, Alexander Rakhlin, Tengyang Xie.
ICLR 2025: "Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF." Tengyang Xie, Dylan J. Foster, Akshay Krishnamurthy, Corby Rosset, Ahmed Hassan Awadallah, Alexander Rakhlin.
NeurIPS 2025: "Outcome-Based Online Reinforcement Learning: Algorithms and Fundamental Limits." Fan Chen, Zeyu Jia, Alexander Rakhlin, Tengyang Xie.
ICML 2025: "Reinforce LLM Reasoning Through Multi-Agent Reflection." Yurun Yuan, Tengyang Xie.
NeurIPS 2025: "Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning." Yurun Yuan, Fan Chen, Zeyu Jia, Alexander Rakhlin, Tengyang Xie.
ICLR 2024: "Harnessing Density Ratios for Online Reinforcement Learning." Philip Amortila, Dylan J. Foster, Nan Jiang, Ayush Sekhari, Tengyang Xie.
ICML 2024: "Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data." Fahim Tajwar, Anikait Singh, Archit Sharma, Rafael Rafailov, Jeff Schneider, Tengyang Xie, Stefano Ermon, Chelsea Finn, Aviral Kumar.
ICLR 2024: "Towards Principled Representation Learning from Videos for Reinforcement Learning." Dipendra Misra, Akanksha Saran, Tengyang Xie, Alex Lamb, John Langford.
NeurIPSW 2024: "Towards Principled Representation Learning from Videos for Reinforcement Learning." Dipendra Misra, Akanksha Saran, Tengyang Xie, Alex Lamb, John Langford.
NeurIPS 2023: "Adversarial Model for Offline Reinforcement Learning." Mohak Bhardwaj, Tengyang Xie, Byron Boots, Nan Jiang, Ching-An Cheng.
ICLR 2023: "The Role of Coverage in Online Reinforcement Learning." Tengyang Xie, Dylan J. Foster, Yu Bai, Nan Jiang, Sham M. Kakade.
NeurIPSW 2022: "AMORE: A Model-Based Framework for Improving Arbitrary Baseline Policies with Offline Data." Tengyang Xie, Mohak Bhardwaj, Nan Jiang, Ching-An Cheng.
ICML 2022: "Adversarially Trained Actor Critic for Offline Reinforcement Learning." Ching-An Cheng, Tengyang Xie, Nan Jiang, Alekh Agarwal.
NeurIPS 2022: "Interaction-Grounded Learning with Action-Inclusive Feedback." Tengyang Xie, Akanksha Saran, Dylan J. Foster, Lekan Molu, Ida Momennejad, Nan Jiang, Paul Mineiro, John Langford.
ICML 2021: "Batch Value-Function Approximation with Only Realizability." Tengyang Xie, Nan Jiang.
NeurIPS 2021: "Bellman-Consistent Pessimism for Offline Reinforcement Learning." Tengyang Xie, Ching-An Cheng, Nan Jiang, Paul Mineiro, Alekh Agarwal.
ICML 2021: "Interaction-Grounded Learning." Tengyang Xie, John Langford, Paul Mineiro, Ida Momennejad.
NeurIPS 2021: "Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning." Tengyang Xie, Nan Jiang, Huan Wang, Caiming Xiong, Yu Bai.
UAI 2020: "Q* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison." Tengyang Xie, Nan Jiang.
NeurIPS 2019: "Provably Efficient Q-Learning with Low Switching Cost." Yu Bai, Tengyang Xie, Nan Jiang, Yu-Xiang Wang.
NeurIPS 2019: "Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling." Tengyang Xie, Yifei Ma, Yu-Xiang Wang.
NeurIPS 2018: "A Block Coordinate Ascent Algorithm for Mean-Variance Optimization." Tengyang Xie, Bo Liu, Yangyang Xu, Mohammad Ghavamzadeh, Yinlam Chow, Daoming Lyu, Daesub Yoon.