Hänni, Kaarel

3 publications

ICMLW 2024. Cluster-Norm for Unsupervised Probing of Knowledge. Walter Laurito, Sharan Maiya, Grégoire Dhimoïla, Owen Ho Wan Yeung, Kaarel Hänni
ICMLW 2024. Mathematical Models of Computation in Superposition. Kaarel Hänni, Jake Mendel, Dmitry Vaintrob, Lawrence Chan
ICMLW 2024. Using Degeneracy in the Loss Landscape for Mechanistic Interpretability. Lucius Bushnaq, Jake Mendel, Stefan Heimersheim, Dan Braun, Nicholas Goldowsky-Dill, Kaarel Hänni, Cindy Wu, Marius Hobbhahn