ML Anthology
Lange, Georg
3 publications
ICLR
2025
Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control
Aleksandar Makelov, Georg Lange, Neel Nanda
ICLR
2024
Is This the Subspace You Are Looking For? An Interpretability Illusion for Subspace Activation Patching
Aleksandar Makelov, Georg Lange, Atticus Geiger, Neel Nanda
ICLRW
2024
Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control
Aleksandar Makelov, Georg Lange, Neel Nanda