Mei, Jincheng

25 publications

TMLR 2026 Beyond Expectations: Learning with Stochastic Dominance Made Practical Shicong Cen, Jincheng Mei, Hanjun Dai, Dale Schuurmans, Yuejie Chi, Bo Dai
AISTATS 2025 Faster WIND: Accelerating Iterative Best-of-$n$ Distillation for LLM Alignment Tong Yang, Jincheng Mei, Hanjun Dai, Zixin Wen, Shicong Cen, Dale Schuurmans, Yuejie Chi, Bo Dai
NeurIPS 2025 REINFORCE Converges to Optimal Policies with Any Learning Rate Samuel McLaughlin Robertson, Thang D. Chu, Bo Dai, Dale Schuurmans, Csaba Szepesvári, Jincheng Mei
ICLR 2025 Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF Shicong Cen, Jincheng Mei, Katayoon Goshvadi, Hanjun Dai, Tong Yang, Sherry Yang, Dale Schuurmans, Yuejie Chi, Bo Dai
NeurIPS 2024 Small Steps No More: Global Convergence of Stochastic Gradient Bandits for Arbitrary Learning Rates Jincheng Mei, Bo Dai, Alekh Agarwal, Sharan Vaswani, Anant Raj, Csaba Szepesvári, Dale Schuurmans
ICML 2024 Target Networks and Over-Parameterization Stabilize Off-Policy Bootstrapping with Function Approximation Fengdi Che, Chenjun Xiao, Jincheng Mei, Bo Dai, Ramki Gummadi, Oscar A Ramirez, Christopher K Harris, A. Rupam Mahmood, Dale Schuurmans
NeurIPS 2023 Ordering-Based Conditions for Global Convergence of Policy Gradient Methods Jincheng Mei, Bo Dai, Alekh Agarwal, Mohammad Ghavamzadeh, Csaba Szepesvári, Dale Schuurmans
ICML 2023 Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo
ICML 2023 Stochastic Gradient Succeeds for Bandits Jincheng Mei, Zixin Zhong, Bo Dai, Alekh Agarwal, Csaba Szepesvári, Dale Schuurmans
NeurIPS 2022 On the Global Convergence Rates of Decentralized Softmax Gradient Play in Markov Potential Games Runyu Zhang, Jincheng Mei, Bo Dai, Dale Schuurmans, Na Li
NeurIPS 2022 The Role of Baselines in Policy Gradient Optimization Jincheng Mei, Wesley Chung, Valentin Thomas, Bo Dai, Csaba Szepesvári, Dale Schuurmans
ICLR 2022 Understanding and Leveraging Overparameterization in Recursive Value Estimation Chenjun Xiao, Bo Dai, Jincheng Mei, Oscar A Ramirez, Ramki Gummadi, Chris Harris, Dale Schuurmans
UAI 2022 Understanding and Mitigating the Limitations of Prioritized Experience Replay Yangchen Pan, Jincheng Mei, Amir-massoud Farahmand, Martha White, Hengshuai Yao, Mohsen Rohani, Jun Luo
ICML 2021 Leveraging Non-Uniformity in First-Order Non-Convex Optimization Jincheng Mei, Yue Gao, Bo Dai, Csaba Szepesvári, Dale Schuurmans
ICML 2021 On the Optimality of Batch Policy Optimization Algorithms Chenjun Xiao, Yifan Wu, Jincheng Mei, Bo Dai, Tor Lattimore, Lihong Li, Csaba Szepesvári, Dale Schuurmans
NeurIPS 2021 Understanding the Effect of Stochasticity in Policy Optimization Jincheng Mei, Bo Dai, Chenjun Xiao, Csaba Szepesvári, Dale Schuurmans
NeurIPS 2020 Escaping the Gravitational Pull of Softmax Jincheng Mei, Chenjun Xiao, Bo Dai, Lihong Li, Csaba Szepesvári, Dale Schuurmans
ICLR 2020 Frequency-Based Search-Control in Dyna Yangchen Pan, Jincheng Mei, Amir-massoud Farahmand
ICML 2020 On the Global Convergence Rates of Softmax Policy Gradient Methods Jincheng Mei, Chenjun Xiao, Csaba Szepesvári, Dale Schuurmans
NeurIPS 2019 Maximum Entropy Monte-Carlo Planning Chenjun Xiao, Ruitong Huang, Jincheng Mei, Dale Schuurmans, Martin Müller
IJCAI 2019 On Principled Entropy Exploration in Policy Optimization Jincheng Mei, Chenjun Xiao, Ruitong Huang, Dale Schuurmans, Martin Müller
AAAI 2018 Memory-Augmented Monte Carlo Tree Search Chenjun Xiao, Jincheng Mei, Martin Müller
AISTATS 2016 On the Reducibility of Submodular Functions Jincheng Mei, Hao Zhang, Bao-Liang Lu
AAAI 2015 On Unconstrained Quasi-Submodular Function Optimization Jincheng Mei, Kang Zhao, Bao-Liang Lu
AAAI 2014 Locality Preserving Hashing Kang Zhao, Hongtao Lu, Jincheng Mei