Chanin, David

5 publications

NeurIPS 2025 A Is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders David Chanin, James Wilken-Smith, Tomáš Dulka, Hardik Bhatnagar, Satvik Golechha, Joseph Isaac Bloom

ICML 2025 SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability Adam Karvonen, Can Rager, Johnny Lin, Curt Tigges, Joseph Isaac Bloom, David Chanin, Yeu-Tong Lau, Eoin Farrell, Callum Stuart McDougall, Kola Ayonrinde, Demian Till, Matthew Wearden, Arthur Conmy, Samuel Marks, Neel Nanda

NeurIPSW 2024 A Is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders David Chanin, James Wilken-Smith, Tomáš Dulka, Hardik Bhatnagar, Joseph Isaac Bloom

NeurIPS 2024 Analysing the Generalisation and Reliability of Steering Vectors Daniel Tan, David Chanin, Aengus Lynch, Brooks Paige, Dimitrios Kanoulas, Adrià Garriga-Alonso, Robert Kirk

ICMLW 2024 Analyzing the Generalization and Reliability of Steering Vectors Daniel Chee Hian Tan, David Chanin, Aengus Lynch, Adrià Garriga-Alonso, Dimitrios Kanoulas, Brooks Paige, Robert Kirk