Barnhart, Logan

1 publication

NeurIPSW 2024. Aligning to What? Limits to RLHF Based Alignment. Logan Barnhart, Reza Akbarian Bafghi, Maziar Raissi, Stephen Becker.