Entropy-SGD Optimizes the Prior of a PAC-Bayes Bound: Generalization Properties of Entropy-SGD and Data-Dependent Priors

Abstract

We show that Entropy-SGD (Chaudhari et al., 2017), when viewed as a learning algorithm, optimizes a PAC-Bayes bound on the risk of a Gibbs (posterior) classifier, i.e., a randomized classifier obtained by a risk-sensitive perturbation of the weights of a learned classifier. Entropy-SGD works by optimizing the bound’s prior, violating the hypothesis of the PAC-Bayes theorem that the prior is chosen independently of the data. Indeed, available implementations of Entropy-SGD rapidly obtain zero training error on random labels, and the same holds for the Gibbs posterior. In order to obtain a valid generalization bound, we rely on a result showing that data-dependent priors obtained by stochastic gradient Langevin dynamics (SGLD) yield valid PAC-Bayes bounds provided the target distribution of SGLD is ε-differentially private. We observe that test error on MNIST and CIFAR-10 falls within the (empirically nonvacuous) risk bounds computed under the assumption that SGLD reaches stationarity. In particular, Entropy-SGLD can be configured to yield relatively tight generalization bounds and still fit real labels, although these same settings do not obtain state-of-the-art performance.
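
For context, the bound referenced above is the classical PAC-Bayes bound; the statement below (in the Langford–Seeger form, with notation chosen here for illustration rather than taken from the paper) makes explicit why the prior must be chosen independently of the data. For a prior \(P\) fixed before seeing the sample \(S\) of size \(m\), with probability at least \(1-\delta\) over \(S\), simultaneously for all posteriors \(Q\),

\[
\mathrm{kl}\!\left(\hat{e}_S(Q)\,\middle\|\,e(Q)\right) \;\le\; \frac{\mathrm{KL}(Q\,\|\,P) + \ln\frac{2\sqrt{m}}{\delta}}{m},
\]

where \(\hat{e}_S(Q)\) and \(e(Q)\) are the empirical and true risks of the Gibbs classifier drawn from \(Q\), and \(\mathrm{kl}\) denotes the KL divergence between Bernoulli distributions. Entropy-SGD effectively tunes \(P\) on the same data used to evaluate the bound, violating this independence assumption; the paper restores validity by requiring the data-dependent prior to be obtained in an ε-differentially private way.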

Cite

Text

Dziugaite and Roy. "Entropy-SGD Optimizes the Prior of a PAC-Bayes Bound: Generalization Properties of Entropy-SGD and Data-Dependent Priors." International Conference on Machine Learning, 2018.

Markdown

[Dziugaite and Roy. "Entropy-SGD Optimizes the Prior of a PAC-Bayes Bound: Generalization Properties of Entropy-SGD and Data-Dependent Priors." International Conference on Machine Learning, 2018.](https://mlanthology.org/icml/2018/dziugaite2018icml-entropysgd/)

BibTeX

@inproceedings{dziugaite2018icml-entropysgd,
  title     = {{Entropy-SGD Optimizes the Prior of a PAC-Bayes Bound: Generalization Properties of Entropy-SGD and Data-Dependent Priors}},
  author    = {Dziugaite, Gintare Karolina and Roy, Daniel M.},
  booktitle = {International Conference on Machine Learning},
  year      = {2018},
  pages     = {1377--1386},
  volume    = {80},
  url       = {https://mlanthology.org/icml/2018/dziugaite2018icml-entropysgd/}
}