ML Anthology
Williams, Marcus
4 publications
ICLRW
2025
CTRL-Rec: Controlling Recommender Systems with Natural Language
Micah Carroll, Adeline Foote, Marcus Williams, Anca Dragan, W. Bradley Knox, Smitha Milli
ICLR
2025
On Targeted Manipulation and Deception When Optimizing LLMs for User Feedback
Marcus Williams, Micah Carroll, Adhyyan Narang, Constantin Weisser, Brendan Murphy, Anca Dragan
ICLR
2024
On the Expressivity of Objective-Specification Formalisms in Reinforcement Learning
Rohan Subramani, Marcus Williams, Max Heitmann, Halfdan Holm, Charlie Griffin, Joar Max Viktor Skalse
NeurIPSW
2024
Targeted Manipulation and Deception Emerge in LLMs Trained on User* Feedback
Marcus Williams, Micah Carroll, Constantin Weisser, Brendan Murphy, Adhyyan Narang, Anca Dragan