Leslie, Sarah-Jane

2 publications

NeurIPS 2025. Causal Head Gating: A Framework for Interpreting Roles of Attention Heads in Transformers. Andrew Joohun Nam, Henry Conklin, Yukang Yang, Thomas L. Griffiths, Jonathan D. Cohen, Sarah-Jane Leslie
ICLRW 2025. Understanding Task Representations in Neural Networks via Bayesian Ablation. Andrew Joohun Nam, Declan Iain Campbell, Thomas L. Griffiths, Jonathan D. Cohen, Sarah-Jane Leslie