Cunningham, Hoagy

1 publication

ICLR 2024. Sparse Autoencoders Find Highly Interpretable Features in Language Models. Robert Huben, Hoagy Cunningham, Logan Riggs Smith, Aidan Ewart, Lee Sharkey