A Sample Complexity Separation Between Non-Convex and Convex Meta-Learning

Abstract

One popular trend in meta-learning is to learn from many training tasks a common initialization that a gradient-based method can use to solve a new task with few samples. The theory of meta-learning is still in its early stages, with several recent learning-theoretic analyses of methods such as Reptile [Nichol et al., 2018] being for \emph{convex models}. This work shows that convex-case analysis might be insufficient to understand the success of meta-learning, and that even for non-convex models it is important to look inside the optimization black-box, specifically at properties of the optimization trajectory. We construct a simple meta-learning instance that captures the problem of one-dimensional subspace learning. For the convex formulation of linear regression on this instance, we show that the new task sample complexity of any \emph{initialization-based meta-learning} algorithm is $\Omega(d)$, where $d$ is the input dimension. In contrast, for the non-convex formulation of a two-layer linear network on the same instance, we show that both Reptile and multi-task representation learning can have new task sample complexity of $O(1)$, demonstrating a separation from convex meta-learning. Crucially, analyses of the training dynamics of these methods reveal that they can meta-learn the correct subspace onto which the data should be projected.
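The abstract's setting can be sketched concretely. Below is a minimal, illustrative toy of the Reptile outer update [Nichol et al., 2018] on linear regression tasks whose true weight vectors all lie on a shared one-dimensional subspace, as in the paper's construction. The task distribution, step sizes, and inner-loop details here are assumptions for illustration, not the paper's exact instance.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10  # input dimension (illustrative)

def inner_sgd(w, X, y, lr=0.05, steps=20):
    """A few gradient steps on squared loss for one task (inner loop)."""
    w = w.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

# Tasks share a 1-D subspace: each task's true weights are a scalar
# multiple of a fixed unit vector u (hypothetical task distribution).
u = rng.standard_normal(d)
u /= np.linalg.norm(u)

w_meta = np.zeros(d)
eps = 0.5  # Reptile outer step size
for _ in range(200):
    a = rng.standard_normal()            # task-specific coefficient
    X = rng.standard_normal((32, d))     # task training data
    y = X @ (a * u)
    w_task = inner_sgd(w_meta, X, y)     # adapt to the task
    w_meta += eps * (w_task - w_meta)    # Reptile update: move toward adapted weights
```

In the convex linear formulation above, the meta-learned quantity is only an initialization $w_{\text{meta}} \in \mathbb{R}^d$, which is the setting where the paper proves the $\Omega(d)$ lower bound; the paper's $O(1)$ upper bound instead applies Reptile to a two-layer linear reparameterization of the same problem.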

Cite

Text

Saunshi et al. "A Sample Complexity Separation Between Non-Convex and Convex Meta-Learning." International Conference on Machine Learning, 2020.

Markdown

[Saunshi et al. "A Sample Complexity Separation Between Non-Convex and Convex Meta-Learning." International Conference on Machine Learning, 2020.](https://mlanthology.org/icml/2020/saunshi2020icml-sample/)

BibTeX

@inproceedings{saunshi2020icml-sample,
  title     = {{A Sample Complexity Separation Between Non-Convex and Convex Meta-Learning}},
  author    = {Saunshi, Nikunj and Zhang, Yi and Khodak, Mikhail and Arora, Sanjeev},
  booktitle = {International Conference on Machine Learning},
  year      = {2020},
  pages     = {8512--8521},
  volume    = {119},
  url       = {https://mlanthology.org/icml/2020/saunshi2020icml-sample/}
}