Studying BatchNorm Learning Rate Decay on Meta-Learning Inner-Loop Adaptation
Abstract
Meta-learning for few-shot classification has been challenged both on its effectiveness relative to simpler pretraining methods and on the validity of its claim to "learn to learn". Recent work has suggested that MAML-based models do not perform "rapid learning" in the inner loop but instead reuse features, adapting only the final linear layer. Separately, BatchNorm, a near-ubiquitous component of model architectures, has been shown to have an implicit learning rate decay effect on the layers that precede it. We study the impact of BatchNorm's implicit learning rate decay on feature reuse in meta-learning methods and find that counteracting it increases the change in intermediate layers during adaptation. We also find that counteracting this learning rate decay sometimes improves performance on few-shot classification tasks.
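The implicit learning rate decay the abstract refers to comes from BatchNorm's scale invariance: scaling a pre-BatchNorm weight matrix by α leaves the loss unchanged but shrinks its gradient by 1/α, so as weights grow during training the effective step size on those layers decays roughly as η/||w||². Below is a minimal PyTorch sketch, not taken from the paper; the toy linear→BatchNorm model, the `loss_and_grad_for` helper, and the final ||w||²-rescaled update are illustrative assumptions, shown only to make the mechanism concrete.

```python
# Minimal sketch (not from the paper): because BatchNorm normalizes its input,
# the loss is invariant to the scale of the preceding layer's weights, so the
# gradient w.r.t. those weights shrinks as they grow -- an implicit learning
# rate decay on pre-BatchNorm layers.
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(32, 8)
target = torch.randn(32, 4)
base_weight = torch.randn(4, 8)

def loss_and_grad_for(scale):
    """Toy linear -> BatchNorm model with the linear weights scaled by `scale`."""
    lin = nn.Linear(8, 4, bias=False)
    bn = nn.BatchNorm1d(4)  # fresh module in training mode: normalizes with batch statistics
    with torch.no_grad():
        lin.weight.copy_(base_weight * scale)
    loss = ((bn(lin(x)) - target) ** 2).mean()
    loss.backward()
    return loss.item(), lin.weight.grad.norm().item()

loss1, grad1 = loss_and_grad_for(1.0)
loss2, grad2 = loss_and_grad_for(2.0)
print(loss1, loss2)   # nearly identical: BatchNorm makes the loss scale-invariant
print(grad1 / grad2)  # ~2.0: doubling the weights halves the gradient norm

# One illustrative way to counteract the decay in an inner-loop update (an
# assumption for this sketch, not necessarily the paper's exact method):
# rescale each pre-BatchNorm layer's step by its squared weight norm,
#   w <- w - lr * w.norm()**2 * w.grad
```

The printed ratio is the whole effect: the loss is unchanged while the gradient norm halves, so a fixed inner-loop learning rate moves pre-BatchNorm layers less and less as their weight norms grow, which is the decay the paper studies counteracting.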
Cite
Text
Wang et al. "Studying BatchNorm Learning Rate Decay on Meta-Learning Inner-Loop Adaptation." NeurIPS 2021 Workshops: MetaLearn, 2021.
Markdown
[Wang et al. "Studying BatchNorm Learning Rate Decay on Meta-Learning Inner-Loop Adaptation." NeurIPS 2021 Workshops: MetaLearn, 2021.](https://mlanthology.org/neuripsw/2021/wang2021neuripsw-studying/)
BibTeX
@inproceedings{wang2021neuripsw-studying,
title = {{Studying BatchNorm Learning Rate Decay on Meta-Learning Inner-Loop Adaptation}},
author = {Wang, Alexander and Doubov, Sasha and Leung, Gary},
booktitle = {NeurIPS 2021 Workshops: MetaLearn},
year = {2021},
url = {https://mlanthology.org/neuripsw/2021/wang2021neuripsw-studying/}
}