ML Anthology
Subkhankulov, Marat
1 publication
ICLR 2026
Small Transformers Don’t Need LayerNorm at Inference Time: Scaling LayerNorm Removal to GPT-2 XL and Implications for Mechanistic Interpretability
Luca Baroni, Galvin Khara, Joachim Schaeffer, Marat Subkhankulov, Stefan Heimersheim