On Dropout and Nuclear Norm Regularization

Abstract

We give a formal and complete characterization of the explicit regularizer induced by dropout in deep linear networks with squared loss. We show that (a) the explicit regularizer is composed of an $\ell_2$-path regularizer and other terms that are also re-scaling invariant, (b) the convex envelope of the induced regularizer is the squared nuclear norm of the network map, and (c) for a sufficiently large dropout rate, we characterize the global optima of the dropout objective. We validate our theoretical findings with empirical results.
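To make the kind of identity the abstract refers to concrete, here is a minimal numerical sketch for the single-hidden-layer special case: applying (inverted) dropout to the hidden layer of a linear network and averaging over masks yields the plain squared loss plus an explicit, re-scaling-invariant penalty on the factors. The setup, the symbols (U, W, X, Y, theta), and the specific closed form below are illustrative assumptions for this shallow case, not a reproduction of the paper's general deep-network result.

```python
# Illustrative sketch (single hidden layer, inverted dropout with retain
# probability theta). Checks by Monte Carlo that the dropout objective in
# expectation equals the squared loss plus an explicit regularizer of the
# form ((1-theta)/theta) * sum_i ||u_i||^2 * ||w_i^T X||^2 / n.
import numpy as np

rng = np.random.default_rng(0)
d, k, m, n = 5, 8, 4, 200      # input dim, hidden width, output dim, samples
theta = 0.5                    # retain probability (dropout rate p = 1 - theta)

X = rng.standard_normal((d, n))
Y = rng.standard_normal((m, n))
U = rng.standard_normal((m, k))
W = rng.standard_normal((k, d))

def dropout_loss_mc(num_masks=50_000):
    """Monte Carlo estimate of E_b || Y - (1/theta) U diag(b) W X ||_F^2 / n."""
    H = W @ X                                        # hidden activations, k x n
    total = 0.0
    for _ in range(num_masks):
        b = rng.binomial(1, theta, size=k) / theta   # inverted-dropout mask
        total += np.sum((Y - U @ (b[:, None] * H)) ** 2) / n
    return total / num_masks

def closed_form():
    """Squared loss plus the explicit regularizer induced by dropout (shallow case)."""
    H = W @ X
    sq_loss = np.sum((Y - U @ H) ** 2) / n
    reg = (1 - theta) / theta * np.sum(
        np.sum(U ** 2, axis=0) * np.sum(H ** 2, axis=1) / n
    )
    return sq_loss + reg

print(dropout_loss_mc())   # matches closed_form() up to Monte Carlo error
print(closed_form())
```

The penalty above is invariant to re-scaling a hidden unit's incoming and outgoing weights by c and 1/c, consistent with point (a) of the abstract; relating its minimum over factorizations of the product UW to the squared nuclear norm is the subject of the paper itself.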

Cite

Text

Mianjy and Arora. "On Dropout and Nuclear Norm Regularization." International Conference on Machine Learning, 2019.

Markdown

[Mianjy and Arora. "On Dropout and Nuclear Norm Regularization." International Conference on Machine Learning, 2019.](https://mlanthology.org/icml/2019/mianjy2019icml-dropout/)

BibTeX

@inproceedings{mianjy2019icml-dropout,
  title     = {{On Dropout and Nuclear Norm Regularization}},
  author    = {Mianjy, Poorya and Arora, Raman},
  booktitle = {International Conference on Machine Learning},
  year      = {2019},
  pages     = {4575--4584},
  volume    = {97},
  url       = {https://mlanthology.org/icml/2019/mianjy2019icml-dropout/}
}