Marks, Luke

1 publications

NeurIPS 2024 Interpreting Learned Feedback Patterns in Large Language Models Luke Marks, Amir Abdullah, Clement Neo, Rauno Arike, David Krueger, Philip Torr, Fazl Barez