Efficient Variance Reduction for Meta-Learning

Abstract

Meta-learning aims to extract meta-knowledge from a large number of tasks. However, the stochastic meta-gradient can have large variance due to both data sampling (within each task) and task sampling (from the whole task distribution), leading to slow convergence. In this paper, we propose a novel approach that integrates variance reduction with first-order meta-learning algorithms such as Reptile. It retains the bilevel formulation, which better captures the structure of meta-learning, but does not require storing the vast number of task-specific parameters needed by general bilevel variance-reduction methods. Theoretical results show that it enjoys a fast convergence rate due to variance reduction. Experiments on benchmark few-shot classification data sets demonstrate its effectiveness over state-of-the-art meta-learning algorithms with and without variance reduction.
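
To make the idea of variance-reduced first-order meta-learning concrete, below is a minimal, self-contained sketch in Python. It is not the paper's algorithm: the toy 1-D linear-regression tasks, the Reptile-style inner loop, and the STORM-style recursive correction on the meta-gradient are all illustrative assumptions; the paper's actual estimator and bilevel analysis may differ.

    import numpy as np

    # Hypothetical toy setup: each "task" is a 1-D linear regression with its own slope.
    # The meta-parameter w is an initialization that adapts quickly via a few SGD steps.
    rng = np.random.default_rng(0)

    def sample_task():
        """Sample a task and return its data sampler (purely illustrative)."""
        slope = rng.normal(loc=2.0, scale=0.5)
        def sample_batch(n=8):
            x = rng.uniform(-1.0, 1.0, size=n)
            y = slope * x + rng.normal(scale=0.1, size=n)
            return x, y
        return sample_batch

    def inner_adapt(w, sample_batch, inner_steps=5, inner_lr=0.1):
        """Reptile-style inner loop: a few SGD steps on one task, starting from w."""
        for _ in range(inner_steps):
            x, y = sample_batch()
            grad = np.mean(2.0 * (w * x - y) * x)   # d/dw of mean squared error
            w = w - inner_lr * grad
        return w

    def reptile_meta_grad(w, sample_batch):
        """First-order meta-gradient estimate: (w - adapted w)."""
        return w - inner_adapt(w, sample_batch)

    # STORM-style recursive variance reduction on the meta-gradient
    # (an assumed instantiation, not the estimator proposed in the paper).
    w = 0.0
    d = None            # variance-reduced meta-gradient estimate
    w_prev = None
    meta_lr, beta = 0.05, 0.9

    for step in range(200):
        task = sample_task()
        g = reptile_meta_grad(w, task)
        if d is None:
            d = g
        else:
            # Re-evaluate on the same task at the previous meta-parameters
            # (fresh inner-loop batches), then apply the recursive correction.
            g_prev = reptile_meta_grad(w_prev, task)
            d = g + (1.0 - beta) * (d - g_prev)
        w_prev = w
        w = w - meta_lr * d

    print(f"learned meta-initialization: {w:.3f} (task slopes ~ N(2.0, 0.5^2))")

The key point the sketch illustrates is that the variance-reduced estimate d only requires keeping the previous meta-parameters w_prev, rather than storing per-task inner-loop parameters, which is the memory issue the abstract attributes to general bilevel variance-reduction methods.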

Cite

Text

Yang and Kwok. "Efficient Variance Reduction for Meta-Learning." International Conference on Machine Learning, 2022.

Markdown

[Yang and Kwok. "Efficient Variance Reduction for Meta-Learning." International Conference on Machine Learning, 2022.](https://mlanthology.org/icml/2022/yang2022icml-efficient/)

BibTeX

@inproceedings{yang2022icml-efficient,
  title     = {{Efficient Variance Reduction for Meta-Learning}},
  author    = {Yang, Hansi and Kwok, James},
  booktitle = {International Conference on Machine Learning},
  year      = {2022},
  pages     = {25070--25095},
  volume    = {162},
  url       = {https://mlanthology.org/icml/2022/yang2022icml-efficient/}
}