Stochastic Modified Equations and Dynamics of Dropout Algorithm

Abstract

Dropout is a widely used regularization technique in the training of neural networks; nevertheless, its underlying mechanism and its impact on achieving good generalization remain to be further understood. In this work, we first undertake a rigorous theoretical derivation of stochastic modified equations, with the primary aim of providing an effective approximation for the discrete iterative process of dropout, and we experimentally verify the ability of this stochastic differential equation (SDE) to approximate dropout under a wider range of settings. Subsequently, we empirically delve into the intricate mechanisms by which dropout facilitates the identification of flatter minima. This exploration is conducted through intuitive approximations that exploit the structural analogies between the Hessian of the loss landscape and the covariance of dropout noise. Our empirical findings substantiate the ubiquitous presence of the Hessian-variance alignment relation throughout the training process of dropout.
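As a rough illustration of the discrete dropout iteration that the paper's stochastic modified equations approximate, here is a minimal sketch in a toy setting where the dropout mask acts directly on the parameters. This is not the authors' code; the quadratic loss, learning rate `lr`, and dropout rate `p` are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_step(theta, grad_fn, lr=0.1, p=0.5):
    """One discrete dropout iteration: sample a Bernoulli mask,
    rescale by 1/(1-p) (inverted dropout), and take a gradient
    step on the loss evaluated at the masked parameters."""
    mask = rng.binomial(1, 1 - p, size=theta.shape) / (1 - p)
    # Chain rule for L(theta * mask): gradient is mask * grad L evaluated
    # at the masked point. The random mask injects the gradient noise
    # that the SDE approximation models.
    return theta - lr * grad_fn(theta * mask) * mask

# Toy quadratic loss L(theta) = 0.5 * ||theta||^2, so grad L = theta.
theta = np.ones(4)
for _ in range(100):
    theta = dropout_step(theta, grad_fn=lambda th: th)
```

The mask resampled at every step makes the update a Markov chain with state-dependent noise, which is why a weak SDE approximation (drift plus a diffusion term built from the dropout covariance) is the natural continuous-time surrogate studied in the paper.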

Cite

Text

Zhang et al. "Stochastic Modified Equations and Dynamics of Dropout Algorithm." International Conference on Learning Representations, 2024.

Markdown

[Zhang et al. "Stochastic Modified Equations and Dynamics of Dropout Algorithm." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/zhang2024iclr-stochastic/)

BibTeX

@inproceedings{zhang2024iclr-stochastic,
  title     = {{Stochastic Modified Equations and Dynamics of Dropout Algorithm}},
  author    = {Zhang, Zhongwang and Li, Yuqing and Luo, Tao and Xu, Zhi-Qin John},
  booktitle = {International Conference on Learning Representations},
  year      = {2024},
  url       = {https://mlanthology.org/iclr/2024/zhang2024iclr-stochastic/}
}