Few-Shot Learning with Localization in Realistic Settings
Abstract
Traditional recognition methods typically require large, artificially-balanced training classes, while few-shot learning methods are tested on artificially small ones. In contrast to both extremes, real world recognition problems exhibit heavy-tailed class distributions, with cluttered scenes and a mix of coarse and fine-grained class distinctions. We show that prior methods designed for few-shot learning do not work out of the box in these challenging conditions, based on a new "meta-iNat" benchmark. We introduce three parameter-free improvements: (a) better training procedures based on adapting cross-validation to meta-learning, (b) novel architectures that localize objects using limited bounding box annotations before classification, and (c) simple parameter-free expansions of the feature space based on bilinear pooling. Together, these improvements double the accuracy of state-of-the-art models on meta-iNat while generalizing to prior benchmarks, complex neural architectures, and settings with substantial domain shift.
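As an illustration of item (c), the following is a minimal sketch of generic bilinear pooling in plain Python. It is not taken from the paper and may differ from the authors' exact variant. The function name `bilinear_pool` and the toy inputs are hypothetical. The sketch shows why the expansion is parameter-free: the pooled vector is the outer product of two feature vectors at each spatial location, averaged over locations, with no learned weights involved.

```python
def bilinear_pool(feats_a, feats_b):
    """Average the outer product of two feature sets over spatial locations.

    feats_a: list of N locations, each a list of C1 floats
    feats_b: list of N locations, each a list of C2 floats
    Returns a flat list of length C1 * C2. No learned parameters are used.
    """
    if len(feats_a) != len(feats_b) or not feats_a:
        raise ValueError("need the same non-zero number of locations")
    c1, c2 = len(feats_a[0]), len(feats_b[0])
    pooled = [0.0] * (c1 * c2)
    for a, b in zip(feats_a, feats_b):
        # Accumulate the outer product a * b^T for this location.
        for i in range(c1):
            for j in range(c2):
                pooled[i * c2 + j] += a[i] * b[j]
    n = len(feats_a)
    return [v / n for v in pooled]


# Toy example (hypothetical values): 2 locations, 2 channels, pooled with itself.
feats = [[1.0, 2.0], [3.0, 4.0]]
pooled = bilinear_pool(feats, feats)
print(pooled)
```

A feature map with C channels expands to C * C dimensions here, so the representation grows while the parameter count stays the same.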
Cite
Text
Wertheimer and Hariharan. "Few-Shot Learning with Localization in Realistic Settings." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019. doi:10.1109/CVPR.2019.00672
Markdown
[Wertheimer and Hariharan. "Few-Shot Learning with Localization in Realistic Settings." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019.](https://mlanthology.org/cvpr/2019/wertheimer2019cvpr-fewshot/) doi:10.1109/CVPR.2019.00672
BibTeX
@inproceedings{wertheimer2019cvpr-fewshot,
title = {{Few-Shot Learning with Localization in Realistic Settings}},
author = {Wertheimer, Davis and Hariharan, Bharath},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {2019},
doi = {10.1109/CVPR.2019.00672},
url = {https://mlanthology.org/cvpr/2019/wertheimer2019cvpr-fewshot/}
}