Freedman, Rachel

5 publications

NeurIPSW 2024 Linear Probe Penalties Reduce LLM Sycophancy Henry Papadatos, Rachel Freedman
ICML 2024 Position: Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback Vincent Conitzer, Rachel Freedman, Jobst Heitzig, Wesley H. Holliday, Bob M. Jacobs, Nathan Lambert, Milan Mosse, Eric Pacuit, Stuart Russell, Hailey Schoelkopf, Emanuel Tewolde, William S. Zwicker
TMLR 2023 Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomek Korbak, David Lindner, Pedro Freire, Tony Tong Wang, Samuel Marks, Charbel-Raphael Segerie, Micah Carroll, Andi Peng, Phillip J.K. Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Biyik, Anca Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell
NeurIPSW 2022 The Expertise Problem: Learning from Specialized Feedback Oliver Daniels-Koch, Rachel Freedman
AAAI 2018 Adapting a Kidney Exchange Algorithm to Align with Human Values Rachel Freedman, Jana Schaich Borg, Walter Sinnott-Armstrong, John P. Dickerson, Vincent Conitzer