Tigges, Curt

7 publications

ICML 2025 SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability Adam Karvonen, Can Rager, Johnny Lin, Curt Tigges, Joseph Isaac Bloom, David Chanin, Yeu-Tong Lau, Eoin Farrell, Callum Stuart McDougall, Kola Ayonrinde, Demian Till, Matthew Wearden, Arthur Conmy, Samuel Marks, Neel Nanda

ICLR 2025 Sparse Autoencoders Do Not Find Canonical Units of Analysis Patrick Leask, Bart Bussmann, Michael T Pearce, Joseph Isaac Bloom, Curt Tigges, Noura Al Moubayed, Lee Sharkey, Neel Nanda

NeurIPS 2024 LLM Circuit Analyses Are Consistent Across Training and Scale Curt Tigges, Michael Hanna, Qinan Yu, Stella Biderman

ICMLW 2024 LLM Circuit Analyses Are Consistent Across Training and Scale Curt Tigges, Michael Hanna, Qinan Yu, Stella Biderman

ICMLW 2024 Language Models Linearly Represent Sentiment Curt Tigges, Oskar John Hollinsworth, Atticus Geiger, Neel Nanda

NeurIPSW 2024 Stitching Sparse Autoencoders of Different Sizes Patrick Leask, Bart Bussmann, Joseph Isaac Bloom, Curt Tigges, Noura Al Moubayed, Neel Nanda

TMLR 2024 Transformer-Based Models Are Not Yet Perfect at Learning to Emulate Structural Recursion Dylan Zhang, Curt Tigges, Zory Zhang, Stella Biderman, Maxim Raginsky, Talia Ringer