Loss Landscape Degeneracy and Stagewise Development in Transformers
Abstract
Deep learning involves navigating a high-dimensional loss landscape over the neural network parameter space. Over the course of training, complex computational structures form and re-form inside the neural network, leading to shifts in input/output behavior. It is a priority for the science of deep learning to uncover principles governing the development of neural network structure and behavior. Drawing on the framework of singular learning theory, we propose that model development is deeply linked to degeneracy in the local geometry of the loss landscape. We investigate this link by monitoring loss landscape degeneracy throughout training, as quantified by the local learning coefficient, for a transformer language model and an in-context linear regression transformer. We show that training can be divided into distinct periods of change in loss landscape degeneracy, and that these changes in degeneracy coincide with significant changes in the internal computational structure and the input/output behavior of the transformers. This finding provides suggestive evidence that degeneracy and development are linked in transformers, underscoring the potential of a degeneracy-based perspective for understanding modern deep learning.
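The central quantity here, the local learning coefficient (LLC), is estimable in practice. Below is a minimal sketch of the SGLD-based estimator from the singular learning theory literature (Lau et al., 2023), which computes λ̂(w*) = nβ(E_w[L_n(w)] − L_n(w*)) with inverse temperature β = 1/log n, where the expectation is over a posterior localized near the trained parameters w*. The toy model, data, and hyperparameters are illustrative assumptions, not the paper's experimental setup.

```python
import math
import torch

def estimate_llc(model, loss_fn, data_iter, n, num_steps=500,
                 step_size=1e-5, gamma=100.0, burn_in=100):
    """SGLD-based estimate of the local learning coefficient at w*.

    Implements llc_hat = n * beta * (E_posterior[L_n(w)] - L_n(w*)),
    with beta = 1/log(n). A quadratic term gamma/2 * ||w - w*||^2
    localizes the chain near the trained parameters w*.
    """
    beta = 1.0 / math.log(n)
    w_star = [p.detach().clone() for p in model.parameters()]

    # Loss at the center w* (a single batch as a cheap proxy for L_n(w*)).
    xb, yb = next(data_iter)
    init_loss = loss_fn(model(xb), yb).item()

    loss_sum, count = 0.0, 0
    for step in range(num_steps):
        xb, yb = next(data_iter)
        loss = loss_fn(model(xb), yb)
        model.zero_grad()
        loss.backward()
        with torch.no_grad():
            for p, p0 in zip(model.parameters(), w_star):
                # SGLD drift: tempered loss gradient plus localization pull.
                drift = n * beta * p.grad + gamma * (p - p0)
                p.add_(-0.5 * step_size * drift
                       + math.sqrt(step_size) * torch.randn_like(p))
        if step >= burn_in:
            loss_sum += loss.item()
            count += 1

    # Restore w* so the caller's model is left unchanged.
    with torch.no_grad():
        for p, p0 in zip(model.parameters(), w_star):
            p.copy_(p0)

    return n * beta * (loss_sum / count - init_loss)

# Illustrative usage on a toy regression model (hypothetical setup).
torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(4, 16), torch.nn.Tanh(),
                            torch.nn.Linear(16, 1))
def batches():
    while True:
        x = torch.randn(64, 4)
        yield x, x.sum(dim=1, keepdim=True)

llc = estimate_llc(model, torch.nn.functional.mse_loss, batches(), n=10_000)
print(f"estimated LLC: {llc:.2f}")
```

The choice β = 1/log n comes from the asymptotics of the Bayesian free energy in singular learning theory; the localization strength γ and step size trade off bias against mixing and would need tuning for any real model.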
Cite
Text
Hoogland et al. "Loss Landscape Degeneracy and Stagewise Development in Transformers." Transactions on Machine Learning Research, 2025.
Markdown
[Hoogland et al. "Loss Landscape Degeneracy and Stagewise Development in Transformers." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/hoogland2025tmlr-loss/)
BibTeX
@article{hoogland2025tmlr-loss,
  title = {{Loss Landscape Degeneracy and Stagewise Development in Transformers}},
  author = {Hoogland, Jesse and Wang, George and Farrugia-Roberts, Matthew and Carroll, Liam and Wei, Susan and Murfet, Daniel},
  journal = {Transactions on Machine Learning Research},
  year = {2025},
  url = {https://mlanthology.org/tmlr/2025/hoogland2025tmlr-loss/}
}