ML Anthology
Knox, W. Bradley
12 publications
ICLRW
2025
CTRL-Rec: Controlling Recommender Systems with Natural Language
Micah Carroll, Adeline Foote, Marcus Williams, Anca Dragan, W. Bradley Knox, Smitha Milli
ICLR
2025
Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions
Michael JQ Zhang, W. Bradley Knox, Eunsol Choi
NeurIPSW
2024
Analyzing Reward Functions via Trajectory Alignment
Calarina Muslimani, Suyog Chandramouli, Serena Booth, W. Bradley Knox, Matthew E. Taylor
ICLR
2024
Contrastive Preference Learning: Learning from Human Feedback Without Reinforcement Learning
Joey Hejna, Rafael Rafailov, Harshit Sikchi, Chelsea Finn, Scott Niekum, W. Bradley Knox, Dorsa Sadigh
AAAI
2024
Learning Optimal Advantage from Preferences and Mistaking It for Reward
W. Bradley Knox, Stephane Hatgis-Kessell, Sigurdur O. Adalgeirsson, Serena Booth, Anca D. Dragan, Peter Stone, Scott Niekum
TMLR
2024
Models of Human Preference for Learning Reward Functions
W. Bradley Knox, Stephane Hatgis-Kessell, Serena Booth, Scott Niekum, Peter Stone, Alessandro G Allievi
AAAI
2024
Reward (Mis)design for Autonomous Driving (Abstract Reprint)
W. Bradley Knox, Alessandro Allievi, Holger Banzhaf, Felix Schmitt, Peter Stone
NeurIPS
2024
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
Rafael Rafailov, Yaswanth Chittepu, Ryan Park, Harshit Sikchi, Joey Hejna, W. Bradley Knox, Chelsea Finn, Scott Niekum
ICMLW
2024
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
Rafael Rafailov, Yaswanth Chittepu, Ryan Park, Harshit Sikchi, Joey Hejna, W. Bradley Knox, Chelsea Finn, Scott Niekum
ICMLW
2023
Learning Optimal Advantage from Preferences and Mistaking It for Reward
W. Bradley Knox, Stephane Hatgis-Kessell, Sigurdur Orn Adalgeirsson, Serena Booth, Anca Dragan, Peter Stone, Scott Niekum
AAAI
2023
The Perils of Trial-and-Error Reward Design: Misdesign Through Overfitting and Invalid Task Specifications
Serena Booth, W. Bradley Knox, Julie Shah, Scott Niekum, Peter Stone, Alessandro Allievi
AAAI
2021
Demonstration of the EMPATHIC Framework for Task Learning from Implicit Human Feedback
Yuchen Cui, Qiping Zhang, Sahil Jain, Alessandro Allievi, Peter Stone, Scott Niekum, W. Bradley Knox