On Bridging the Gap Between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization

Abstract

Mean-field theory is widely used in theoretical studies of neural networks. In this paper, we analyze the role of depth in the concentration of mean-field predictions for Gram matrices of hidden representations in deep multilayer perceptrons (MLPs) with batch normalization (BN) at initialization. It is postulated that mean-field predictions suffer from layer-wise errors that amplify with depth. We demonstrate that BN avoids this error amplification with depth. When the chain of hidden representations is rapidly mixing, we establish a concentration bound for a mean-field model of Gram matrices. To our knowledge, this is the first concentration bound that does not become vacuous with depth for standard finite-width MLPs.
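
To make the object of study concrete, here is a minimal NumPy sketch of the setup the abstract describes: a randomly initialized MLP with batch normalization, whose batch-wise Gram matrices of hidden representations are tracked across depth. This is an illustration under assumed choices (Gaussian weights, ReLU followed by BN, a small batch), not the authors' code or their exact architecture.

import numpy as np

def batchnorm(h, eps=1e-8):
    # Batch normalization at initialization: center and scale each
    # neuron (rows) across the batch (columns).
    mu = h.mean(axis=1, keepdims=True)
    sigma = h.std(axis=1, keepdims=True)
    return (h - mu) / (sigma + eps)

def gram_across_depth(width, depth, batch, seed=0):
    # Forward a batch through a random MLP with BN and record the
    # width-normalized (batch x batch) Gram matrix at every layer.
    rng = np.random.default_rng(seed)
    h = rng.standard_normal((width, batch))
    grams = []
    for _ in range(depth):
        W = rng.standard_normal((width, width)) / np.sqrt(width)
        h = batchnorm(np.maximum(W @ h, 0.0))  # ReLU then BN (one assumed choice)
        grams.append(h.T @ h / width)
    return grams

# Heuristic check of the concentration phenomenon: at a fixed depth,
# the Gram matrix should fluctuate less around a deterministic
# (mean-field) limit as the width grows.
for n in (64, 256, 1024):
    G = gram_across_depth(width=n, depth=50, batch=4)[-1]
    print(n, np.round(G, 2))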

Cite

Text

Joudaki et al. "On Bridging the Gap Between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization." International Conference on Machine Learning, 2023.

Markdown

[Joudaki et al. "On Bridging the Gap Between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/joudaki2023icml-bridging/)

BibTeX

@inproceedings{joudaki2023icml-bridging,
  title     = {{On Bridging the Gap Between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization}},
  author    = {Joudaki, Amir and Daneshmand, Hadi and Bach, Francis},
  booktitle = {International Conference on Machine Learning},
  year      = {2023},
  pages     = {15388--15400},
  volume    = {202},
  url       = {https://mlanthology.org/icml/2023/joudaki2023icml-bridging/}
}