Huben, Robert

2 publications

ICLR 2024. Sparse Autoencoders Find Highly Interpretable Features in Language Models. Robert Huben, Hoagy Cunningham, Logan Riggs Smith, Aidan Ewart, Lee Sharkey

NeurIPSW 2023. Attention-Only Transformers and Implementing MLPs with Attention Heads. Robert Huben, Valerie Morris