Helyar, Alec

2 publications

ICMLW 2024 Rule Based Rewards for Fine-Grained LLM Safety Tong Mu, Alec Helyar, Johannes Heidecke, Joshua Achiam, Andrea Vallone, Ian D Kivlichan, Molly Lin, Alex Beutel, John Schulman, Lilian Weng
NeurIPS 2024 Rule Based Rewards for Language Model Safety Tong Mu, Alec Helyar, Johannes Heidecke, Joshua Achiam, Andrea Vallone, Ian Kivlichan, Molly Lin, Alex Beutel, John Schulman, Lilian Weng