ML Anthology
Metcalf, Katherine
7 publications
ICML
2025
Aligning LLMs by Predicting Preferences from User Writing Samples
Stéphane Aroca-Ouellette, Natalie Mackraz, Barry-John Theobald, Katherine Metcalf
ICML
2025
Is Your Model Fairly Certain? Uncertainty-Aware Fairness Evaluation for LLMs
Yinong Oliver Wang, Nivedha Sivakumar, Falaah Arif Khan, Katherine Metcalf, Adam Golinski, Natalie Mackraz, Barry-John Theobald, Luca Zappella, Nicholas Apostoloff
AAAI
2024
Can You Rely on Synthetic Labellers in Preference-Based Reinforcement Learning? It's Complicated
Katherine Metcalf, Miguel Sarabia, Masha Fedzechkina, Barry-John Theobald
ICLR
2024
Hindsight PRIORs for Reward Learning from Human Preferences
Mudit Verma, Katherine Metcalf
ICML
2024
Whispering Experts: Neural Interventions for Toxicity Mitigation in Language Models
Xavier Suau, Pieter Delobelle, Katherine Metcalf, Armand Joulin, Nicholas Apostoloff, Luca Zappella, Pau Rodriguez
CoRL
2023
Sample-Efficient Preference-Based Reinforcement Learning with Dynamics Aware Rewards
Katherine Metcalf, Miguel Sarabia, Natalie Mackraz, Barry-John Theobald
IJCAI
2019
Unsupervised Hierarchical Temporal Abstraction by Simultaneously Learning Expectations and Representations
Katherine Metcalf, David Leake