Generalization Bounds for Meta-Learning: An Information-Theoretic Analysis

Abstract

We derive a novel information-theoretic analysis of the generalization properties of meta-learning algorithms. Concretely, our analysis provides a generic understanding of both the conventional learning-to-learn framework \citep{amit2018meta} and the modern model-agnostic meta-learning (MAML) algorithms \citep{finn2017model}. Moreover, we provide a data-dependent generalization bound for a stochastic variant of MAML, which is \emph{non-vacuous} for deep few-shot learning. Compared to previous bounds that depend on the squared norm of gradients, empirical validations on both simulated data and a well-known few-shot benchmark show that our bound is orders of magnitude tighter in most conditions.
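For context, a minimal sketch of the kind of bound an information-theoretic analysis yields (this is the standard single-task result of Xu and Raginsky, 2017, not the paper's meta-learning bound): if the loss is $\sigma$-sub-Gaussian and an algorithm outputs hypothesis $W$ from a training sample $S$ of $n$ i.i.d. examples, then

\[
  \left| \mathbb{E}\big[ L_\mu(W) - L_S(W) \big] \right|
  \;\le\; \sqrt{\frac{2\sigma^2}{n}\, I(W; S)},
\]

where $L_\mu$ is the population risk, $L_S$ the empirical risk on $S$, and $I(W;S)$ the mutual information between the learned hypothesis and the training data. The paper extends this style of mutual-information bound to the two-level (meta-learner over base-learners) structure of meta-learning and MAML; the notation $W$, $S$, $\sigma$, $n$ above is illustrative rather than the paper's own.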

Cite

Text

Chen et al. "Generalization Bounds for Meta-Learning: An Information-Theoretic Analysis." Neural Information Processing Systems, 2021.

Markdown

[Chen et al. "Generalization Bounds for Meta-Learning: An Information-Theoretic Analysis." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/chen2021neurips-generalization/)

BibTeX

@inproceedings{chen2021neurips-generalization,
  title     = {{Generalization Bounds for Meta-Learning: An Information-Theoretic Analysis}},
  author    = {Chen, Qi and Shui, Changjian and Marchand, Mario},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/chen2021neurips-generalization/}
}