GEMS: Generating Efficient Meta-Subnets
Abstract
Gradient-based meta-learners (GBML) such as MAML aim to learn a model initialization across similar tasks, such that the model generalizes well to unseen tasks sampled from the same distribution with only a few gradient updates. A limitation of GBML is its inability to adapt to real-world applications where input tasks are sampled from multiple distributions. An existing effort learns N initializations for tasks sampled from N distributions, roughly increasing training time by a factor of N. Instead, we use a single model initialization and learn distribution-specific parameters for every input task. This reduces negative knowledge transfer across distributions as well as the overall computational cost. Specifically, we explore two ways of learning efficiently on multi-distribution tasks: 1) the Binary Mask Perceptron (BMP), which learns distribution-specific layers, and 2) the Multi-modal Supermask (MMSUP), which learns distribution-specific parameters. We evaluate the performance of the proposed framework (GEMS) on few-shot vision classification tasks. The experimental results demonstrate a significant improvement in accuracy and a reduction in training time over existing state-of-the-art algorithms on quasi-benchmark tasks.
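To make the masking idea concrete, below is a minimal PyTorch sketch (not the authors' implementation) of how a per-distribution binary mask could gate a single shared meta-initialization, in the spirit of BMP/MMSUP. The class name MaskedLinear, the hard-threshold mask, and the usage values are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    """Linear layer whose shared weights are gated by a per-distribution
    binary mask (supermask-style sketch). Only the mask scores are specific
    to a task distribution; the weights form the shared meta-initialization."""

    def __init__(self, in_features, out_features, num_distributions):
        super().__init__()
        # Shared meta-initialization used by every task distribution.
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight)
        # One real-valued score tensor per distribution; thresholding it
        # yields that distribution's binary mask (straight-through gradient
        # estimation omitted here for brevity).
        self.scores = nn.Parameter(
            torch.randn(num_distributions, out_features, in_features))

    def forward(self, x, dist_id):
        # Distribution-specific binary mask selects a subnetwork of the
        # shared weights for this input task.
        mask = (self.scores[dist_id] > 0).float()
        return nn.functional.linear(x, self.weight * mask)

# Hypothetical usage: route a batch from distribution 2 through its subnet.
layer = MaskedLinear(in_features=64, out_features=32, num_distributions=3)
out = layer(torch.randn(8, 64), dist_id=2)
print(out.shape)  # torch.Size([8, 32])
```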
Cite
@inproceedings{pimpalkhute2023wacv-gems,
title = {{GEMS: Generating Efficient Meta-Subnets}},
author = {Pimpalkhute, Varad and Kunde, Shruti and Singhal, Rekha},
booktitle = {Winter Conference on Applications of Computer Vision},
year = {2023},
pages = {5315-5323},
url = {https://mlanthology.org/wacv/2023/pimpalkhute2023wacv-gems/}
}