Meta-Curvature

Abstract

We propose meta-curvature (MC), a framework to learn curvature information for better generalization and fast model adaptation. MC expands on the model-agnostic meta-learner (MAML) by learning to transform the gradients in the inner optimization such that the transformed gradients achieve better generalization performance to a new task. For training large-scale neural networks, we decompose the curvature matrix into smaller matrices in a novel scheme where we capture the dependencies of the model's parameters with a series of tensor products. We demonstrate the effects of our proposed method on several few-shot learning tasks and datasets. Without any task-specific techniques and architectures, the proposed method achieves substantial improvement upon previous MAML variants and outperforms the recent state-of-the-art methods. Furthermore, we observe faster convergence rates of the meta-training process. Finally, we present an analysis that explains better generalization performance with the meta-trained curvature.
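The gradient transformation described in the abstract can be illustrated with a minimal NumPy sketch. It assumes a three-way gradient tensor (output channels, input channels, filter size) and one small square matrix per axis. The names `transform_gradient`, `inner_step`, `m_out`, `m_in`, and `m_filt` are hypothetical and are not taken from the authors' code. In the proposed framework the factors would be meta-learned in the outer loop; here they are random placeholders.

```python
import numpy as np


def transform_gradient(grad, m_out, m_in, m_filt):
    """Multiply each axis of a 3-way gradient tensor by its own small matrix."""
    # Contract m_out with axis 0, m_in with axis 1, m_filt with axis 2.
    return np.einsum("ao,bi,cf,oif->abc", m_out, m_in, m_filt, grad)


def inner_step(theta, grad, factors, lr=0.1):
    """One inner-loop update that uses the transformed gradient."""
    return theta - lr * transform_gradient(grad, *factors)


rng = np.random.default_rng(0)
c_out, c_in, d = 4, 3, 2  # illustrative sizes only

theta = rng.normal(size=(c_out, c_in, d))
grad = rng.normal(size=(c_out, c_in, d))

# Placeholder factors; meta-training would learn these.
m_out = rng.normal(size=(c_out, c_out))
m_in = rng.normal(size=(c_in, c_in))
m_filt = rng.normal(size=(d, d))

transformed = transform_gradient(grad, m_out, m_in, m_filt)

# The same transformation written as one large matrix acting on the
# flattened gradient: a Kronecker product of the three small factors.
full = np.kron(np.kron(m_out, m_in), m_filt)
dense = (full @ grad.reshape(-1)).reshape(grad.shape)

# With identity factors the update reduces to a plain gradient step.
identity_factors = (np.eye(c_out), np.eye(c_in), np.eye(d))
plain = inner_step(theta, grad, identity_factors)

n_full_params = full.size
n_factored_params = m_out.size + m_in.size + m_filt.size
```

For these sizes the dense matrix has 24 × 24 = 576 entries, while the three factors together have 16 + 9 + 4 = 29, which is the kind of saving that makes the decomposition practical for larger layers.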

Cite

Text

Park and Oliva. "Meta-Curvature." Neural Information Processing Systems, 2019.

Markdown

[Park and Oliva. "Meta-Curvature." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/park2019neurips-metacurvature/)

BibTeX

@inproceedings{park2019neurips-metacurvature,
  title     = {{Meta-Curvature}},
  author    = {Park, Eunbyung and Oliva, Junier B},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {3314--3324},
  url       = {https://mlanthology.org/neurips/2019/park2019neurips-metacurvature/}
}