Towards Understanding Neural Collapse: The Effects of Batch Normalization and Weight Decay

Abstract

Neural Collapse (NC) is a geometric structure recently observed at the terminal phase of training deep neural networks: last-layer feature vectors for the same class 'collapse' to a single point, while features of different classes become equally separated. We demonstrate that batch normalization (BN) and weight decay (WD) critically influence the emergence of NC. In the near-optimal loss regime, we establish an asymptotic lower bound on the emergence of NC that depends only on the WD value, training loss, and the presence of last-layer BN. Our experiments substantiate these theoretical insights by showing that models exhibit stronger NC with BN, appropriate WD values, and lower loss. Our findings offer a novel perspective on the role of BN and WD in shaping neural network features.
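
The two geometric properties named in the abstract can be illustrated with a minimal sketch. It assumes, as one possible proxy and not necessarily the measure used in the paper, that collapse is quantified by the average cosine similarity between globally centered last-layer features: close to 1 within a class, and close to -1/(C-1) between classes when C class means are equally separated. The function name `nc_cosine_metrics` and the synthetic data are hypothetical.

```python
import numpy as np

def nc_cosine_metrics(features, labels):
    """Return (intra, inter): mean pairwise cosine similarity of globally
    centered features within the same class and across different classes.
    Hypothetical helper for illustration only."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # Center all features at the global mean.
    centered = features - features.mean(axis=0, keepdims=True)
    # Normalize each feature to unit length (guarding against zero norm).
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    unit = centered / np.clip(norms, 1e-12, None)
    cos = unit @ unit.T
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    intra = cos[same & off_diag].mean()  # same class, excluding self-pairs
    inter = cos[~same].mean()            # pairs from different classes
    return intra, inter

# Synthetic, perfectly collapsed features: 3 classes, 4 identical samples each.
feats = np.repeat(np.eye(3), 4, axis=0)
labs = np.repeat(np.arange(3), 4)
intra, inter = nc_cosine_metrics(feats, labs)
print(intra, inter)  # approximately 1.0 and -0.5, i.e. -1/(C-1) with C = 3
```

On real networks these values would only approach the ideal ones; per the abstract, how closely they do so depends on the WD value, the training loss, and the presence of last-layer BN.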

Cite

Text

Pan and Cao. "Towards Understanding Neural Collapse: The Effects of Batch Normalization and Weight Decay." Transactions on Machine Learning Research, 2026.

Markdown

[Pan and Cao. "Towards Understanding Neural Collapse: The Effects of Batch Normalization and Weight Decay." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/pan2026tmlr-understanding/)

BibTeX

@article{pan2026tmlr-understanding,
  title     = {{Towards Understanding Neural Collapse: The Effects of Batch Normalization and Weight Decay}},
  author    = {Pan, Leyan and Cao, Xinyuan},
  journal   = {Transactions on Machine Learning Research},
  year      = {2026},
  url       = {https://mlanthology.org/tmlr/2026/pan2026tmlr-understanding/}
}