Accelerated PDEs for Construction and Theoretical Analysis of an SGD Extension
Abstract
We introduce PDE Acceleration, a recently developed variational framework for accelerated optimization via partial differential equations (PDEs), in the context of optimizing deep networks. Using this variational approach, we derive PDE evolution equations for the optimization of general loss functions. We propose discretizations of these PDEs based on numerical schemes for PDEs, and establish a mapping between these discretizations and stochastic gradient descent (SGD). We show that our framework can give rise to new PDEs that map to new optimization algorithms, so that theoretical insights from the PDE domain can be used to analyze optimization algorithms. As an example, we introduce a new PDE with a diffusion term that arises naturally from the viscosity solution, which translates to a novel extension of SGD. We analyze the stability and convergence of the resulting scheme analytically using von Neumann analysis. We apply the proposed extension to the optimization of convolutional neural networks (CNNs), empirically validate the theory, and evaluate the extension on image classification, showing improvement over SGD.
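Since the abstract only sketches the method at a high level, the following minimal Python/NumPy sketch illustrates how a diffusion term can extend momentum SGD. It assumes the accelerated PDE takes a damped wave form, u_tt + a u_t = eps Laplacian(u) - grad L(u), and that an explicit time discretization yields a heavy-ball update plus a discrete Laplacian smoothing over the spatial axes of each convolutional kernel. The function names, the 5-point stencil, the replicate boundary handling, and all hyperparameter values below are illustrative assumptions, not the paper's implementation.

import numpy as np

def laplacian(w):
    # Discrete 5-point Laplacian over the last two (spatial) axes of a
    # convolutional kernel, with replicate padding at the borders.
    # Stencil and boundary handling are illustrative choices.
    p = np.pad(w, [(0, 0)] * (w.ndim - 2) + [(1, 1), (1, 1)], mode="edge")
    return (p[..., :-2, 1:-1] + p[..., 2:, 1:-1]
            + p[..., 1:-1, :-2] + p[..., 1:-1, 2:] - 4.0 * w)

def diffusion_sgd_step(w, v, grad, lr=0.1, damping=0.9, eps=1e-3):
    # One momentum (heavy-ball) SGD step augmented with a diffusion term,
    # read off from an explicit time discretization of
    # u_tt + a*u_t = eps*Laplacian(u) - grad L(u).
    # Function and hyperparameter names are hypothetical, not the paper's.
    v = damping * v - lr * (grad - eps * laplacian(w))
    return w + v, v

# Example: one step on a random bank of 3x3 convolutional kernels.
w = np.random.randn(16, 3, 3, 3)   # (out_ch, in_ch, kH, kW)
v = np.zeros_like(w)
g = np.random.randn(*w.shape)      # stochastic gradient placeholder
w, v = diffusion_sgd_step(w, v, g)

For an explicit scheme of this kind, a von Neumann analysis would typically yield a CFL-type stability bound coupling the learning rate, the diffusion coefficient eps, and the damping; this is the sort of constraint the stability analysis in the abstract refers to.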
Cite
Text
Sun et al. "Accelerated PDEs for Construction and Theoretical Analysis of an SGD Extension." NeurIPS 2021 Workshops: DLDE, 2021.Markdown
[Sun et al. "Accelerated PDEs for Construction and Theoretical Analysis of an SGD Extension." NeurIPS 2021 Workshops: DLDE, 2021.](https://mlanthology.org/neuripsw/2021/sun2021neuripsw-accelerated/)BibTeX
@inproceedings{sun2021neuripsw-accelerated,
  title = {{Accelerated PDEs for Construction and Theoretical Analysis of an SGD Extension}},
  author = {Sun, Yuxin and Lao, Dong and Sundaramoorthi, Ganesh and Yezzi, Anthony},
  booktitle = {NeurIPS 2021 Workshops: DLDE},
  year = {2021},
  url = {https://mlanthology.org/neuripsw/2021/sun2021neuripsw-accelerated/}
}