Studying BatchNorm Learning Rate Decay on Meta-Learning Inner-Loop Adaptation

Abstract

Meta-learning for few-shot classification has been challenged both on its effectiveness relative to simpler pretraining methods and on the validity of its claim to "learning to learn". Recent work has suggested that MAML-based models do not perform "rapid learning" in the inner loop but instead reuse features, adapting only the final linear layer. Separately, BatchNorm, a near-ubiquitous component of model architectures, has been shown to exert an implicit learning rate decay on the layers that precede it in a network. We study the impact of this implicit learning rate decay on feature reuse in meta-learning methods and find that counteracting it increases the change in intermediate layers during adaptation. We also find that counteracting this learning rate decay sometimes improves performance on few-shot classification tasks.
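
The implicit decay the abstract refers to comes from the scale-invariance that BatchNorm induces: for a weight tensor w that feeds into a BatchNorm layer, a gradient step with rate eta behaves like an effective rate of roughly eta / ||w||^2, so inner-loop updates to that layer shrink as ||w|| grows. The sketch below is one illustrative way to counteract this in a MAML-style inner-loop step, by rescaling the step for each BatchNorm-preceded weight by ||w||^2. It is a minimal sketch under those assumptions, not the authors' exact procedure; the ConvBlock model, the layer names, and the counteract_bn_decay flag are placeholders invented for the example.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call


def inner_loop_step(model, loss, lr, counteract_bn_decay=True):
    """One MAML-style adaptation step; optionally undo BatchNorm's implicit decay."""
    params = dict(model.named_parameters())
    grads = torch.autograd.grad(loss, list(params.values()), create_graph=True)
    adapted = {}
    for (name, p), g in zip(params.items(), grads):
        step = lr
        # Illustrative heuristic: a weight that feeds into a BatchNorm layer is
        # scale-invariant, so multiplying its step by ||w||^2 cancels the
        # implicit eta / ||w||^2 learning rate decay on that layer.
        if counteract_bn_decay and name == "conv.weight":
            step = lr * p.detach().norm() ** 2
        adapted[name] = p - step * g
    return adapted


# Illustrative usage on random data (shapes and hyperparameters are placeholders).
class ConvBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 32, 3, padding=1)   # feeds into BatchNorm
        self.bn = nn.BatchNorm2d(32)
        self.head = nn.Linear(32, 5)                 # 5-way classification head

    def forward(self, x):
        h = F.relu(self.bn(self.conv(x)))
        return self.head(h.mean(dim=(2, 3)))


model = ConvBlock()
x, y = torch.randn(8, 3, 28, 28), torch.randint(0, 5, (8,))
support_loss = F.cross_entropy(model(x), y)
adapted = inner_loop_step(model, support_loss, lr=0.01)
query_logits = functional_call(model, adapted, (x,))  # forward pass with adapted weights

In this sketch, leaving counteract_bn_decay off reproduces a plain MAML inner step, while turning it on enlarges the update to the BatchNorm-preceded convolution, which is the kind of intervention under which the abstract reports greater change in intermediate layers during adaptation.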

Cite

Text

Wang et al. "Studying BatchNorm Learning Rate Decay on Meta-Learning Inner-Loop Adaptation." NeurIPS 2021 Workshops: MetaLearn, 2021.

Markdown

[Wang et al. "Studying BatchNorm Learning Rate Decay on Meta-Learning Inner-Loop Adaptation." NeurIPS 2021 Workshops: MetaLearn, 2021.](https://mlanthology.org/neuripsw/2021/wang2021neuripsw-studying/)

BibTeX

@inproceedings{wang2021neuripsw-studying,
  title     = {{Studying BatchNorm Learning Rate Decay on Meta-Learning Inner-Loop Adaptation}},
  author    = {Wang, Alexander and Doubov, Sasha and Leung, Gary},
  booktitle = {NeurIPS 2021 Workshops: MetaLearn},
  year      = {2021},
  url       = {https://mlanthology.org/neuripsw/2021/wang2021neuripsw-studying/}
}