Optimization as a Model for Few-Shot Learning

Abstract

Though deep neural networks have shown great success in the large data domain, they generally perform poorly on few-shot learning tasks, where a model has to quickly generalize after seeing very few examples from each class. The general belief is that gradient-based optimization in high-capacity models requires many iterative steps over many examples to perform well. Here, we propose an LSTM-based meta-learner model to learn the exact optimization algorithm used to train another learner neural network in the few-shot regime. The parametrization of our model allows it to learn appropriate parameter updates specifically for the scenario where a fixed number of updates will be made, while also learning a general initialization of the learner network that allows for quick convergence of training. We demonstrate that this meta-learning model is competitive with deep metric-learning techniques for few-shot learning.
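The idea of learning "appropriate parameter updates" for a fixed number of steps can be made concrete with a small sketch. It assumes the update takes the form of an LSTM cell-state update, theta' = f * theta + i * (-grad), which reduces to plain gradient descent when the forget gate f is 1 and the input gate i equals the learning rate. The abstract does not spell out this form, and in a trained meta-learner the gates would be produced by the LSTM from quantities such as the current loss and gradient. Here they are fixed constants, and all function names and the toy quadratic loss are hypothetical, chosen only for illustration.

```python
def gated_update(theta, grad, forget_gate, input_gate):
    """Elementwise LSTM-cell-style update: theta' = f * theta + i * (-grad)."""
    return [f * t + i * (-g)
            for t, g, f, i in zip(theta, grad, forget_gate, input_gate)]


def sgd_update(theta, grad, lr):
    """Plain gradient descent step, for comparison."""
    return [t - lr * g for t, g in zip(theta, grad)]


def quadratic_grad(theta, target):
    """Gradient of the toy loss 0.5 * sum((theta - target) ** 2)."""
    return [t - c for t, c in zip(theta, target)]


def run_episode(theta0, target, forget_gate, input_gate, num_steps):
    """Apply a fixed number of gated updates starting from theta0.

    theta0 stands in for the learned initialization; the gates are
    constants here (an assumption), not outputs of a trained meta-learner.
    """
    theta = list(theta0)
    for _ in range(num_steps):
        grad = quadratic_grad(theta, target)
        theta = gated_update(theta, grad, forget_gate, input_gate)
    return theta


if __name__ == "__main__":
    theta0 = [0.0, 2.0]
    target = [1.0, 1.0]
    # With f = 1 and i = 0.5 the gated update is gradient descent with lr = 0.5.
    final = run_episode(theta0, target, [1.0, 1.0], [0.5, 0.5], num_steps=3)
    print(final)
```

Because the number of steps is fixed in advance, both the gate values and the starting point `theta0` can in principle be tuned so that the learner ends up close to a good solution after exactly that many updates, which is the scenario the abstract describes.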

Cite

Text

Ravi and Larochelle. "Optimization as a Model for Few-Shot Learning." International Conference on Learning Representations, 2017.

Markdown

[Ravi and Larochelle. "Optimization as a Model for Few-Shot Learning." International Conference on Learning Representations, 2017.](https://mlanthology.org/iclr/2017/ravi2017iclr-optimization/)

BibTeX

@inproceedings{ravi2017iclr-optimization,
  title     = {{Optimization as a Model for Few-Shot Learning}},
  author    = {Ravi, Sachin and Larochelle, Hugo},
  booktitle = {International Conference on Learning Representations},
  year      = {2017},
  url       = {https://mlanthology.org/iclr/2017/ravi2017iclr-optimization/}
}