Time-Independent Generalization Bounds for SGLD in Non-Convex Settings

Abstract

We establish generalization error bounds for stochastic gradient Langevin dynamics (SGLD) with constant learning rate under the assumptions of dissipativity and smoothness, a setting that has received increased attention in the sampling/optimization literature. Unlike existing bounds for SGLD in non-convex settings, ours are time-independent and decay to zero as the sample size increases. Using the framework of uniform stability, we establish time-independent bounds by exploiting the Wasserstein contraction property of the Langevin diffusion, which also allows us to circumvent the need to bound gradients using Lipschitz-like assumptions. Our analysis also supports variants of SGLD that use different discretization methods, incorporate Euclidean projections, or use non-isotropic noise.
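For readers unfamiliar with the algorithm the abstract refers to, the following is a minimal illustrative sketch (not taken from the paper) of a single SGLD update with a constant learning rate and isotropic Gaussian noise; the function names, parameter names, and gradient oracle are assumptions made purely for this example.

import numpy as np

def sgld_step(theta, stochastic_grad, eta=1e-2, beta=1.0, rng=np.random.default_rng()):
    """One SGLD iteration: a stochastic gradient step plus injected Gaussian noise.

    theta           : current parameter vector (np.ndarray)
    stochastic_grad : function returning an unbiased mini-batch gradient at theta
    eta             : constant learning rate (step size)
    beta            : inverse temperature controlling the noise level
    """
    noise = rng.standard_normal(theta.shape)
    # Gradient descent step perturbed by isotropic noise scaled as sqrt(2*eta/beta),
    # the standard Euler-Maruyama discretization of the Langevin diffusion.
    return theta - eta * stochastic_grad(theta) + np.sqrt(2.0 * eta / beta) * noise

The variants mentioned in the abstract modify this update, for example by replacing the Euler-Maruyama discretization, projecting the iterate back onto a Euclidean ball after each step, or replacing the isotropic noise with a non-isotropic covariance.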

Cite

Text

Farghly and Rebeschini. "Time-Independent Generalization Bounds for SGLD in Non-Convex Settings." Neural Information Processing Systems, 2021.

Markdown

[Farghly and Rebeschini. "Time-Independent Generalization Bounds for SGLD in Non-Convex Settings." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/farghly2021neurips-timeindependent/)

BibTeX

@inproceedings{farghly2021neurips-timeindependent,
  title     = {{Time-Independent Generalization Bounds for SGLD in Non-Convex Settings}},
  author    = {Farghly, Tyler and Rebeschini, Patrick},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/farghly2021neurips-timeindependent/}
}