ML Anthology
Kivlichan, Ian
1 publication
NeurIPS 2024
Rule Based Rewards for Language Model Safety
Tong Mu, Alec Helyar, Johannes Heidecke, Joshua Achiam, Andrea Vallone, Ian Kivlichan, Molly Lin, Alex Beutel, John Schulman, Lilian Weng