Implicit Bias of Gradient Descent on Linear Convolutional Networks

Abstract

We show that gradient descent on full-width linear convolutional networks of depth $L$ converges to a linear predictor related to the $\ell_{2/L}$ bridge penalty in the frequency domain. This is in contrast to fully connected linear networks, where gradient descent converges to the hard-margin linear SVM solution, regardless of depth.
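
As a rough sketch of the characterization summarized above (the exponential-type loss, separable data, and the stationary-point qualification for $L > 2$ are details taken from the full paper, not from this page): writing $\hat{\beta}$ for the discrete Fourier transform of the end-to-end linear predictor $\beta$, the limit direction of gradient descent on a depth-$L$ full-width linear convolutional network is characterized by

$$
\min_{\beta} \ \|\hat{\beta}\|_{2/L}^{2/L} \quad \text{s.t.} \quad y_n \langle x_n, \beta \rangle \ge 1 \ \ \forall n,
$$

whereas for fully connected linear networks of any depth the corresponding problem is $\min_{\beta} \|\beta\|_2$ under the same margin constraints, i.e., the hard-margin linear SVM.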

Cite

Text

Gunasekar et al. "Implicit Bias of Gradient Descent on Linear Convolutional Networks." Neural Information Processing Systems, 2018.

Markdown

[Gunasekar et al. "Implicit Bias of Gradient Descent on Linear Convolutional Networks." Neural Information Processing Systems, 2018.](https://mlanthology.org/neurips/2018/gunasekar2018neurips-implicit/)

BibTeX

@inproceedings{gunasekar2018neurips-implicit,
  title     = {{Implicit Bias of Gradient Descent on Linear Convolutional Networks}},
  author    = {Gunasekar, Suriya and Lee, Jason and Soudry, Daniel and Srebro, Nati},
  booktitle = {Neural Information Processing Systems},
  year      = {2018},
  pages     = {9461--9471},
  url       = {https://mlanthology.org/neurips/2018/gunasekar2018neurips-implicit/}
}