A Universal Representation Transformer Layer for Few-Shot Image Classification

Abstract

Few-shot classification aims to recognize unseen classes when presented with only a small number of samples. We consider the problem of multi-domain few-shot image classification, where unseen classes and examples come from diverse data sources. This problem has seen growing interest and has inspired the development of benchmarks such as Meta-Dataset. A key challenge in this multi-domain setting is to effectively integrate the feature representations from the diverse set of training domains. Here, we propose a Universal Representation Transformer (URT) layer that meta-learns to leverage universal features for few-shot classification by dynamically re-weighting and composing the most appropriate domain-specific representations. In experiments, we show that URT sets a new state-of-the-art result on Meta-Dataset. Specifically, it achieves top performance on the highest number of data sources compared to competing methods. We analyze variants of URT and present a visualization of the attention score heatmaps that sheds light on how the model performs cross-domain generalization.
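The re-weighting idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes generic scaled dot-product attention with no learned projections, and the names `reweight_domain_features`, `query`, and `domain_features` are hypothetical. It only shows how attention scores over per-domain feature vectors could yield a single composed representation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reweight_domain_features(query, domain_features):
    """Compose per-domain feature vectors using attention weights.

    query: hypothetical task summary vector (list of floats).
    domain_features: one feature vector per domain-specific backbone,
        each the same length as `query`.
    Assumption: plain scaled dot-product scores, no learned parameters.
    """
    dim = len(query)
    # One score per domain: similarity between the query and that domain's features.
    scores = [
        sum(q * f for q, f in zip(query, feat)) / math.sqrt(dim)
        for feat in domain_features
    ]
    weights = softmax(scores)
    # Weighted sum of the domain features, coordinate by coordinate.
    combined = [
        sum(w * feat[i] for w, feat in zip(weights, domain_features))
        for i in range(dim)
    ]
    return weights, combined


# Toy usage: two "domains", the query aligns with the first one.
weights, combined = reweight_domain_features([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In this toy call the first domain receives the larger weight because its features align with the query. A meta-learned layer would additionally train projections that produce the query and keys, which this sketch leaves out.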

Cite

Text

Liu et al. "A Universal Representation Transformer Layer for Few-Shot Image Classification." International Conference on Learning Representations, 2021.

Markdown

[Liu et al. "A Universal Representation Transformer Layer for Few-Shot Image Classification." International Conference on Learning Representations, 2021.](https://mlanthology.org/iclr/2021/liu2021iclr-universal/)

BibTeX

@inproceedings{liu2021iclr-universal,
  title     = {{A Universal Representation Transformer Layer for Few-Shot Image Classification}},
  author    = {Liu, Lu and Hamilton, William L. and Long, Guodong and Jiang, Jing and Larochelle, Hugo},
  booktitle = {International Conference on Learning Representations},
  year      = {2021},
  url       = {https://mlanthology.org/iclr/2021/liu2021iclr-universal/}
}