Farrell, Eoin

2 publications

ICML 2025. SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability. Adam Karvonen, Can Rager, Johnny Lin, Curt Tigges, Joseph Isaac Bloom, David Chanin, Yeu-Tong Lau, Eoin Farrell, Callum Stuart Mcdougall, Kola Ayonrinde, Demian Till, Matthew Wearden, Arthur Conmy, Samuel Marks, Neel Nanda.
NeurIPSW 2024. Applying Sparse Autoencoders to Unlearn Knowledge in Language Models. Eoin Farrell, Yeu-Tong Lau, Arthur Conmy.