ML Anthology
Brantley, Kianté
21 publications
NeurIPS
2025
$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training
Jin Peng Zhou, Kaiwen Wang, Jonathan Daniel Chang, Zhaolin Gao, Nathan Kallus, Kilian Q. Weinberger, Kianté Brantley, Wen Sun
NeurIPS
2025
Accelerating RL for LLM Reasoning with Optimal Advantage Regression
Kianté Brantley, Mingyu Chen, Zhaolin Gao, Jason D. Lee, Wen Sun, Wenhao Zhan, Xuezhou Zhang
ICLR
2025
Diffusing States and Matching Scores: A New Framework for Imitation Learning
Runzhe Wu, Yiding Chen, Gokul Swamy, Kianté Brantley, Wen Sun
ICLR
2025
Regressing the Relative Future: Efficient Policy Optimization for Multi-Turn RLHF
Zhaolin Gao, Wenhao Zhan, Jonathan Daniel Chang, Gokul Swamy, Kianté Brantley, Jason D. Lee, Wen Sun
NeurIPS
2025
Scaling Offline RL via Efficient and Expressive Shortcut Models
Nicolas Espinosa-Dice, Yiyi Zhang, Yiding Chen, Bradley Guo, Owen Oertell, Gokul Swamy, Kianté Brantley, Wen Sun
NeurIPS
2025
Value-Guided Search for Efficient Chain-of-Thought Reasoning
Kaiwen Wang, Jin Peng Zhou, Jonathan Daniel Chang, Zhaolin Gao, Nathan Kallus, Kianté Brantley, Wen Sun
ICLR
2024
Adversarial Imitation Learning via Boosting
Jonathan Daniel Chang, Dhruv Sreenivas, Yingbing Huang, Kianté Brantley, Wen Sun
ICML
2024
Coactive Learning for Large Language Models Using Implicit User Feedback
Aaron David Tucker, Kianté Brantley, Adam Cahall, Thorsten Joachims
NeurIPS
2024
REBEL: Reinforcement Learning via Regressing Relative Rewards
Zhaolin Gao, Jonathan D. Chang, Wenhao Zhan, Owen Oertell, Gokul Swamy, Kianté Brantley, Thorsten Joachims, J. Andrew Bagnell, Jason D. Lee, Wen Sun
ICMLW
2024
REBEL: Reinforcement Learning via Regressing Relative Rewards
Zhaolin Gao, Jonathan Daniel Chang, Wenhao Zhan, Owen Oertell, Gokul Swamy, Kianté Brantley, Thorsten Joachims, J. Andrew Bagnell, Jason D. Lee, Wen Sun
ICMLW
2024
REBEL: Reinforcement Learning via Regressing Relative Rewards
Zhaolin Gao, Jonathan Daniel Chang, Wenhao Zhan, Owen Oertell, Gokul Swamy, Kianté Brantley, Thorsten Joachims, J. Andrew Bagnell, Jason D. Lee, Wen Sun
ICML
2024
When Is Transfer Learning Possible?
My Phan, Kianté Brantley, Stephanie Milani, Soroush Mehri, Gokul Swamy, Geoffrey J. Gordon
ICLR
2023
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy, Prithviraj Ammanabrolu, Kianté Brantley, Jack Hessel, Rafet Sifa, Christian Bauckhage, Hannaneh Hajishirzi, Yejin Choi
NeurIPSW
2023
Learning to Generate Better than Your LLM
Jonathan Chang, Kianté Brantley, Rajkumar Ramamurthy, Dipendra Misra, Wen Sun
NeurIPSW
2023
Policy-Gradient Training of Language Models for Ranking
Ge Gao, Jonathan Daniel Chang, Claire Cardie, Kianté Brantley, Thorsten Joachims
NeurIPSW
2022
lilGym: Natural Language Visual Reasoning with Reinforcement Learning
Anne Wu, Kianté Brantley, Noriyuki Kojima, Yoav Artzi
AAAI
2021
Successor Feature Sets: Generalizing Successor Representations Across Policies
Kianté Brantley, Soroush Mehri, Geoffrey J. Gordon
NeurIPS
2020
Constrained Episodic Reinforcement Learning in Concave-Convex and Knapsack Settings
Kianté Brantley, Miro Dudik, Thodoris Lykouris, Sobhan Miryoosefi, Max Simchowitz, Aleksandrs Slivkins, Wen Sun
ICLR
2020
Disagreement-Regularized Imitation Learning
Kianté Brantley, Wen Sun, Mikael Henaff
ICML
2019
Non-Monotonic Sequential Text Generation
Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho
NeurIPS
2019
Reinforcement Learning with Convex Constraints
Sobhan Miryoosefi, Kianté Brantley, Hal Daumé III, Miro Dudik, Robert E. Schapire