Efficient Variance Reduction for Meta-Learning
Abstract
Meta-learning aims to learn meta-knowledge from a large number of tasks. However, the stochastic meta-gradient can have large variance due to data sampling (from each task) and task sampling (from the whole task distribution), leading to slow convergence. In this paper, we propose a novel approach that integrates variance reduction with first-order meta-learning algorithms such as Reptile. It retains the bilevel formulation, which better captures the structure of meta-learning, but does not require storing the vast number of task-specific parameters needed by general bilevel variance-reduction methods. Theoretical results show that it has a fast convergence rate due to variance reduction. Experiments on benchmark few-shot classification datasets demonstrate its effectiveness over state-of-the-art meta-learning algorithms with and without variance reduction.
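As a rough illustration of the general idea described in the abstract (combining a first-order meta-learner such as Reptile with a variance-reduced meta-update), the Python sketch below pairs a Reptile-style meta-update with a STORM-style momentum correction on a toy linear-regression task distribution. It is not the authors' algorithm; the toy tasks, the functions sample_task, inner_adapt and reptile_direction, and all hyperparameters (inner_lr, meta_lr, momentum) are hypothetical placeholders chosen only to keep the example self-contained.

# Illustrative sketch only (NOT the paper's method): a Reptile-style
# first-order meta-update combined with a STORM-style momentum correction
# to damp the variance from task and data sampling.
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    """Toy 1-D linear-regression task: y = a*x + b with task-specific (a, b)."""
    a, b = rng.normal(size=2)
    return a, b

def sample_batch(task, n=10):
    a, b = task
    x = rng.uniform(-1.0, 1.0, size=n)
    y = a * x + b + 0.1 * rng.normal(size=n)
    return x, y

def inner_adapt(theta, task, inner_lr=0.05, inner_steps=5):
    """A few SGD steps on the sampled task, starting from the meta-parameters."""
    w = theta.copy()
    for _ in range(inner_steps):
        x, y = sample_batch(task)
        pred = w[0] * x + w[1]
        grad = np.array([np.mean(2 * (pred - y) * x), np.mean(2 * (pred - y))])
        w -= inner_lr * grad
    return w

def reptile_direction(theta, task):
    """Reptile's first-order meta-gradient surrogate: theta minus adapted parameters."""
    return theta - inner_adapt(theta, task)

# Meta-training with a STORM-style variance-reduced estimate of the meta-update.
theta = np.zeros(2)
d = None                      # running variance-reduced direction
meta_lr, momentum = 0.1, 0.9  # placeholder hyperparameters
for step in range(500):
    task = sample_task()
    g_new = reptile_direction(theta, task)
    if d is None:
        d = g_new
    else:
        # Correction term re-evaluates the same task at the previous iterate.
        # For simplicity this sketch redraws mini-batches inside inner_adapt;
        # a careful implementation would reuse the same data at both iterates.
        g_old = reptile_direction(theta_prev, task)
        d = g_new + momentum * (d - g_old)
    theta_prev = theta.copy()
    theta = theta - meta_lr * d

Note that the correction term only re-evaluates the current task at the previous meta-parameters, so nothing beyond the meta-parameters themselves is carried across iterations; this loosely mirrors the abstract's point about not storing task-specific parameters, though the paper's actual construction should be taken from the paper itself.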
Cite
Text
Yang and Kwok. "Efficient Variance Reduction for Meta-Learning." International Conference on Machine Learning, 2022.
BibTeX
@inproceedings{yang2022icml-efficient,
title = {{Efficient Variance Reduction for Meta-Learning}},
author = {Yang, Hansi and Kwok, James},
booktitle = {International Conference on Machine Learning},
year = {2022},
pages = {25070--25095},
volume = {162},
url = {https://mlanthology.org/icml/2022/yang2022icml-efficient/}
}