Towards Understanding Neural Collapse: The Effects of Batch Normalization and Weight Decay
Abstract
Neural Collapse (NC) is a geometric structure recently observed at the terminal phase of training deep neural networks, in which last-layer feature vectors of the same class "collapse" to a single point, while features of different classes become equally separated. We demonstrate that batch normalization (BN) and weight decay (WD) critically influence the emergence of NC. In the near-optimal loss regime, we establish an asymptotic lower bound on the emergence of NC that depends only on the WD value, the training loss, and the presence of last-layer BN. Our experiments substantiate these theoretical insights by showing that models exhibit stronger NC with BN, appropriate WD values, and lower loss. Our findings offer a novel perspective on the role of BN and WD in shaping neural network features.
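The two geometric properties named in the abstract can be illustrated with a small sketch. It is not the paper's measure of NC. It is a minimal illustration, assuming balanced classes and using hypothetical helper names (`class_means`, `within_class_variance`, `pairwise_mean_cosines`), of how one might check that same-class features sit on a single point and that centered class means are equally separated.

```python
import math

def class_means(features, labels):
    """Return {label: mean feature vector} (hypothetical helper)."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def within_class_variance(features, labels):
    """Average squared distance of each feature to its class mean.
    A value near zero means same-class features have collapsed to a point."""
    means = class_means(features, labels)
    total = 0.0
    for x, y in zip(features, labels):
        total += sum((a - b) ** 2 for a, b in zip(x, means[y]))
    return total / len(features)

def pairwise_mean_cosines(features, labels):
    """Cosine similarity between every pair of centered class means.
    Assumption: classes are balanced, so the global mean is taken as the
    average of the class means."""
    means = class_means(features, labels)
    keys = sorted(means)
    dim = len(means[keys[0]])
    global_mean = [sum(means[k][i] for k in keys) / len(keys) for i in range(dim)]
    centered = [[m - g for m, g in zip(means[k], global_mean)] for k in keys]
    cosines = []
    for i in range(len(centered)):
        for j in range(i + 1, len(centered)):
            dot = sum(a * b for a, b in zip(centered[i], centered[j]))
            norm_i = math.sqrt(sum(a * a for a in centered[i]))
            norm_j = math.sqrt(sum(b * b for b in centered[j]))
            cosines.append(dot / (norm_i * norm_j))
    return cosines

# Toy data: three classes whose features sit exactly on the vertices of an
# equilateral triangle, two identical samples per class.
s = math.sqrt(3) / 2
toy_features = [[1.0, 0.0], [1.0, 0.0], [-0.5, s], [-0.5, s], [-0.5, -s], [-0.5, -s]]
toy_labels = [0, 0, 1, 1, 2, 2]

toy_variance = within_class_variance(toy_features, toy_labels)
toy_cosines = pairwise_mean_cosines(toy_features, toy_labels)
```

On this toy data the within-class variance is zero and every pairwise cosine is approximately -1/2, which is the equal-separation value -1/(C-1) for C = 3 classes. Real last-layer features would only approach these values.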
Cite
Text
Pan and Cao. "Towards Understanding Neural Collapse: The Effects of Batch Normalization and Weight Decay." Transactions on Machine Learning Research, 2026.
Markdown
[Pan and Cao. "Towards Understanding Neural Collapse: The Effects of Batch Normalization and Weight Decay." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/pan2026tmlr-understanding/)
BibTeX
@article{pan2026tmlr-understanding,
title = {{Towards Understanding Neural Collapse: The Effects of Batch Normalization and Weight Decay}},
author = {Pan, Leyan and Cao, Xinyuan},
journal = {Transactions on Machine Learning Research},
year = {2026},
url = {https://mlanthology.org/tmlr/2026/pan2026tmlr-understanding/}
}