ML Anthology
Pfohl, Stephen Robert
2 publications
NeurIPSW 2023
Reward Model Underspecification in Language Model Alignment
Jacob Eisenstein, Jonathan Berant, Chirag Nagpal, Alekh Agarwal, Ahmad Beirami, Alexander Nicholas D'Amour, Krishnamurthy Dj Dvijotham, Katherine A Heller, Stephen Robert Pfohl, Deepak Ramachandran
NeurIPSW 2023
Understanding Subgroup Performance Differences of Fair Predictors Using Causal Models
Stephen Robert Pfohl, Natalie Harris, Chirag Nagpal, David Madras, Vishwali Mhasawade, Olawale Elijah Salaudeen, Katherine A Heller, Sanmi Koyejo, Alexander Nicholas D'Amour