ML Anthology
Dumbalska, Tsvetomira
1 publication
ICLR 2026
Reward Models Inherit Value Biases from Pretraining
Brian Christian, Jessica A F Thompson, Elle, Vincent Adam, Hannah Rose Kirk, Christopher Summerfield, Tsvetomira Dumbalska