Xu, Tengyu

16 publications

ICML 2025 Think Smarter Not Harder: Adaptive Reasoning with Inference Aware Optimization Zishun Yu, Tengyu Xu, Di Jin, Karthik Abinav Sankararaman, Yun He, Wenxuan Zhou, Zhouhao Zeng, Eryk Helenowski, Chen Zhu, Sinong Wang, Hao Ma, Han Fang
ICLRW 2025 Think Smarter Not Harder: Adaptive Reasoning with Inference Aware Optimization Zishun Yu, Tengyu Xu, Di Jin, Karthik Abinav Sankararaman, Yun He, Wenxuan Zhou, Zhouhao Zeng, Eryk Helenowski, Chen Zhu, Sinong Wang, Hao Ma, Han Fang
NeurIPS 2022 A Unifying Framework of Off-Policy General Value Function Evaluation Tengyu Xu, Zhuoran Yang, Zhaoran Wang, Yingbin Liang
UAI 2022 Deterministic Policy Gradient: Convergence Analysis Huaqing Xiong, Tengyu Xu, Lin Zhao, Yingbin Liang, Wei Zhang
ICLR 2022 Model-Based Offline Meta-Reinforcement Learning with Regularization Sen Lin, Jialin Wan, Tengyu Xu, Yingbin Liang, Junshan Zhang
ICLR 2022 PER-ETD: A Polynomially Efficient Emphatic Temporal Difference Learning Method Ziwei Guan, Tengyu Xu, Yingbin Liang
AISTATS 2021 Sample Complexity Bounds for Two Timescale Value-Based Reinforcement Learning Algorithms Tengyu Xu, Yingbin Liang
AISTATS 2021 When Will Generative Adversarial Imitation Learning Algorithms Attain Global Convergence Ziwei Guan, Tengyu Xu, Yingbin Liang
ICML 2021 CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee Tengyu Xu, Yingbin Liang, Guanghui Lan
ICML 2021 Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality Tengyu Xu, Zhuoran Yang, Zhaoran Wang, Yingbin Liang
AAAI 2021 Non-Asymptotic Convergence of Adam-Type Reinforcement Learning Algorithms Under Markovian Sampling Huaqing Xiong, Tengyu Xu, Yingbin Liang, Wei Zhang
ICLR 2021 Proximal Gradient Descent-Ascent: Variable Convergence Under KŁ Geometry Ziyi Chen, Yi Zhou, Tengyu Xu, Yingbin Liang
NeurIPS 2020 Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms Tengyu Xu, Zhe Wang, Yingbin Liang
ICLR 2020 Reanalysis of Variance Reduced Temporal Difference Learning Tengyu Xu, Zhe Wang, Yi Zhou, Yingbin Liang
NeurIPS 2019 Finite-Sample Analysis for SARSA with Linear Function Approximation Shaofeng Zou, Tengyu Xu, Yingbin Liang
NeurIPS 2019 Two Time-Scale Off-Policy TD Learning: Non-Asymptotic Analysis over Markovian Samples Tengyu Xu, Shaofeng Zou, Yingbin Liang