Analyzing the Generalization Capability of SGLD Using Properties of Gaussian Channels

Wang, Hao; Huang, Yizhe; Gao, Rui; Calmon, Flavio

NeurIPS 2021

Abstract

Optimization is a key component of training machine learning models and has a strong impact on their generalization. In this paper, we consider a particular optimization method, the stochastic gradient Langevin dynamics (SGLD) algorithm, and investigate the generalization of models trained by SGLD. We derive a new generalization bound by connecting SGLD with Gaussian channels found in information and communication theory. Our bound can be computed from the training data and incorporates the variance of gradients to quantify a particular kind of "sharpness" of the loss landscape. We also consider an algorithm closely related to SGLD, namely differentially private SGD (DP-SGD). We prove that the generalization capability of DP-SGD can be amplified by iteration. Specifically, our bound can be sharpened by including a time-decaying factor if the DP-SGD algorithm outputs the last iterate while keeping the other iterates hidden. This decay factor makes the contribution of early iterations to our bound shrink over time and is established by strong data processing inequalities, a fundamental tool in information theory. We demonstrate our bound through numerical experiments, showing that it can predict the behavior of the true generalization gap.
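For readers unfamiliar with SGLD, the update it refers to can be sketched as a mini-batch gradient step followed by added isotropic Gaussian noise, which is what makes each iteration resemble passing the parameters through a Gaussian channel. The snippet below is only an illustrative sketch and is not taken from the paper: the function names, the least-squares loss, and the way the noise scale is parameterized (`noise_std` as a free parameter) are all assumptions chosen for the example.

```python
import numpy as np


def sgld_step(w, grad_fn, batch, step_size, noise_std, rng):
    """One noisy update: a gradient step plus isotropic Gaussian noise.

    Hypothetical helper for illustration; the noise scale convention
    (a free `noise_std`) is an assumption, not the paper's notation.
    """
    grad = grad_fn(w, batch)
    noise = rng.normal(0.0, noise_std, size=w.shape)
    return w - step_size * grad + noise


def least_squares_grad(w, batch):
    """Gradient of the mean squared error 0.5 * mean((Xw - y)^2) on a batch."""
    X, y = batch
    return X.T @ (X @ w - y) / len(y)


def run_sgld(X, y, steps=200, batch_size=16, step_size=0.1, noise_std=0.01, seed=0):
    """Run the noisy updates on random mini-batches and return the last iterate."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        # Sample a mini-batch without replacement for this iteration.
        idx = rng.choice(len(y), size=batch_size, replace=False)
        w = sgld_step(w, least_squares_grad, (X[idx], y[idx]), step_size, noise_std, rng)
    return w
```

Setting `noise_std` to zero recovers plain mini-batch SGD, which makes it easy to compare the two on the same data.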

Cite

Text

Wang et al. "Analyzing the Generalization Capability of SGLD Using Properties of Gaussian Channels." Neural Information Processing Systems, 2021.

Markdown

[Wang et al. "Analyzing the Generalization Capability of SGLD Using Properties of Gaussian Channels." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/wang2021neurips-analyzing/)

BibTeX

@inproceedings{wang2021neurips-analyzing,
  title     = {{Analyzing the Generalization Capability of SGLD Using Properties of Gaussian Channels}},
  author    = {Wang, Hao and Huang, Yizhe and Gao, Rui and Calmon, Flavio},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/wang2021neurips-analyzing/}
}