On Bridging the Gap Between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization
Abstract
Mean-field theory is widely used in theoretical studies of neural networks. In this paper, we analyze the role of depth in the concentration of mean-field predictions for Gram matrices of hidden representations in deep multilayer perceptrons (MLPs) with batch normalization (BN) at initialization. It is postulated that mean-field predictions suffer from layer-wise errors that amplify with depth. We demonstrate that BN avoids this error amplification with depth. When the chain of hidden representations is rapidly mixing, we establish a concentration bound for a mean-field model of Gram matrices. To our knowledge, this is the first concentration bound that does not become vacuous with depth for standard MLPs of finite width.
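The setting of the abstract is easy to simulate. The sketch below is a minimal illustration, not the authors' code: it initializes a random MLP with batch normalization and tracks the batch-by-batch Gram matrix of hidden representations across depth. The width, depth, and batch size are illustrative choices, and `batch_norm` is a simplified per-neuron normalization over the batch (no learnable scale or shift, as is standard at initialization).

```python
# Minimal sketch (assumed setup, not the paper's code): a randomly
# initialized MLP with batch normalization, tracking the Gram matrix
# of hidden representations layer by layer.
import numpy as np

rng = np.random.default_rng(0)
width, depth, batch = 512, 50, 4  # illustrative hyperparameters

def batch_norm(h):
    # Normalize each neuron (row) across the batch: zero mean, unit variance.
    h = h - h.mean(axis=1, keepdims=True)
    return h / np.sqrt((h ** 2).mean(axis=1, keepdims=True) + 1e-8)

x = rng.standard_normal((width, batch))  # one input per column
h = x
for layer in range(depth):
    # i.i.d. Gaussian weights with 1/sqrt(width) scaling, as in mean-field analyses
    w = rng.standard_normal((width, width)) / np.sqrt(width)
    h = batch_norm(w @ h)
    gram = h.T @ h / width  # batch-by-batch Gram matrix of hidden representations
print(np.round(gram, 3))
```

At finite width, the printed Gram matrix fluctuates around its mean-field prediction; the paper's result is that, with BN, these fluctuations do not blow up as `depth` grows.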
Cite
Text
Joudaki et al. "On Bridging the Gap Between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization." International Conference on Machine Learning, 2023.

Markdown
[Joudaki et al. "On Bridging the Gap Between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/joudaki2023icml-bridging/)

BibTeX
@inproceedings{joudaki2023icml-bridging,
title = {{On Bridging the Gap Between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization}},
author = {Joudaki, Amir and Daneshmand, Hadi and Bach, Francis},
booktitle = {International Conference on Machine Learning},
year = {2023},
pages = {15388--15400},
volume = {202},
url = {https://mlanthology.org/icml/2023/joudaki2023icml-bridging/}
}