Understanding Weight Normalized Deep Neural Networks with Rectified Linear Units

Abstract

This paper presents a general framework for norm-based capacity control of $L_{p,q}$ weight normalized deep neural networks. We establish an upper bound on the Rademacher complexity of this family. With $L_{p,q}$ normalization, where $q\le p^*$ and $1/p+1/p^{*}=1$, the capacity control is width-independent and depends on the depth only through a square-root factor. We further analyze the approximation properties of $L_{p,q}$ weight normalized deep neural networks. In particular, for an $L_{1,\infty}$ weight normalized network, the approximation error can be controlled by the $L_1$ norm of the output layer, and the corresponding generalization error depends on the architecture only through the square root of the depth.
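For context (this definition does not appear on this page and follows a common convention; the paper's own row/column convention may differ), the $L_{p,q}$ norm of a layer's weight matrix $W \in \mathbb{R}^{m \times n}$ is

$$\|W\|_{p,q} = \Bigg(\sum_{j=1}^{n}\bigg(\sum_{i=1}^{m}|W_{ij}|^{p}\bigg)^{q/p}\Bigg)^{1/q},$$

with the usual maximum replacing the sum when $p=\infty$ or $q=\infty$. Under this convention, $\|W\|_{1,\infty}$ is the largest $L_1$ norm among the columns of $W$, matching the $L_{1,\infty}$ case highlighted above (there $p=1$, $p^*=\infty$, so $q=\infty$ satisfies $q\le p^*$).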

Cite

Text

Xu and Wang. "Understanding Weight Normalized Deep Neural Networks with Rectified Linear Units." Neural Information Processing Systems, 2018.

Markdown

[Xu and Wang. "Understanding Weight Normalized Deep Neural Networks with Rectified Linear Units." Neural Information Processing Systems, 2018.](https://mlanthology.org/neurips/2018/xu2018neurips-understanding/)

BibTeX

@inproceedings{xu2018neurips-understanding,
  title     = {{Understanding Weight Normalized Deep Neural Networks with Rectified Linear Units}},
  author    = {Xu, Yixi and Wang, Xiao},
  booktitle = {Neural Information Processing Systems},
  year      = {2018},
  pages     = {130--139},
  url       = {https://mlanthology.org/neurips/2018/xu2018neurips-understanding/}
}