Brantley, Kianté

20 publications

NeurIPS 2025 Q♯: Provably Optimal Distributional RL for LLM Post-Training Jin Peng Zhou, Kaiwen Wang, Jonathan Daniel Chang, Zhaolin Gao, Nathan Kallus, Kilian Q. Weinberger, Kianté Brantley, Wen Sun
NeurIPS 2025 Accelerating RL for LLM Reasoning with Optimal Advantage Regression Kianté Brantley, Mingyu Chen, Zhaolin Gao, Jason D. Lee, Wen Sun, Wenhao Zhan, Xuezhou Zhang
ICLR 2025 Diffusing States and Matching Scores: A New Framework for Imitation Learning Runzhe Wu, Yiding Chen, Gokul Swamy, Kianté Brantley, Wen Sun
ICLR 2025 Regressing the Relative Future: Efficient Policy Optimization for Multi-Turn RLHF Zhaolin Gao, Wenhao Zhan, Jonathan Daniel Chang, Gokul Swamy, Kianté Brantley, Jason D. Lee, Wen Sun
NeurIPS 2025 Scaling Offline RL via Efficient and Expressive Shortcut Models Nicolas Espinosa-Dice, Yiyi Zhang, Yiding Chen, Bradley Guo, Owen Oertell, Gokul Swamy, Kianté Brantley, Wen Sun
NeurIPS 2025 Value-Guided Search for Efficient Chain-of-Thought Reasoning Kaiwen Wang, Jin Peng Zhou, Jonathan Daniel Chang, Zhaolin Gao, Nathan Kallus, Kianté Brantley, Wen Sun
ICLR 2024 Adversarial Imitation Learning via Boosting Jonathan Daniel Chang, Dhruv Sreenivas, Yingbing Huang, Kianté Brantley, Wen Sun
ICML 2024 Coactive Learning for Large Language Models Using Implicit User Feedback Aaron David Tucker, Kianté Brantley, Adam Cahall, Thorsten Joachims
NeurIPS 2024 REBEL: Reinforcement Learning via Regressing Relative Rewards Zhaolin Gao, Jonathan D. Chang, Wenhao Zhan, Owen Oertell, Gokul Swamy, Kianté Brantley, Thorsten Joachims, J. Andrew Bagnell, Jason D. Lee, Wen Sun
ICMLW 2024 REBEL: Reinforcement Learning via Regressing Relative Rewards Zhaolin Gao, Jonathan Daniel Chang, Wenhao Zhan, Owen Oertell, Gokul Swamy, Kianté Brantley, Thorsten Joachims, J. Andrew Bagnell, Jason D. Lee, Wen Sun
ICML 2024 When Is Transfer Learning Possible? My Phan, Kianté Brantley, Stephanie Milani, Soroush Mehri, Gokul Swamy, Geoffrey J. Gordon
ICLR 2023 Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization Rajkumar Ramamurthy, Prithviraj Ammanabrolu, Kianté Brantley, Jack Hessel, Rafet Sifa, Christian Bauckhage, Hannaneh Hajishirzi, Yejin Choi
NeurIPSW 2023 Learning to Generate Better than Your LLM Jonathan Chang, Kianté Brantley, Rajkumar Ramamurthy, Dipendra Misra, Wen Sun
NeurIPSW 2023 Policy-Gradient Training of Language Models for Ranking Ge Gao, Jonathan Daniel Chang, Claire Cardie, Kianté Brantley, Thorsten Joachims
NeurIPSW 2022 lilGym: Natural Language Visual Reasoning with Reinforcement Learning Anne Wu, Kianté Brantley, Noriyuki Kojima, Yoav Artzi
AAAI 2021 Successor Feature Sets: Generalizing Successor Representations Across Policies Kianté Brantley, Soroush Mehri, Geoffrey J. Gordon
NeurIPS 2020 Constrained Episodic Reinforcement Learning in Concave-Convex and Knapsack Settings Kianté Brantley, Miro Dudik, Thodoris Lykouris, Sobhan Miryoosefi, Max Simchowitz, Aleksandrs Slivkins, Wen Sun
ICLR 2020 Disagreement-Regularized Imitation Learning Kianté Brantley, Wen Sun, Mikael Henaff
ICML 2019 Non-Monotonic Sequential Text Generation Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho
NeurIPS 2019 Reinforcement Learning with Convex Constraints Sobhan Miryoosefi, Kianté Brantley, Hal Daumé III, Miro Dudik, Robert E. Schapire