VizWiz-FewShot: Locating Objects in Images Taken by People with Visual Impairments

Yu-Yun Tseng, Alexander Bell, Danna Gurari

ECCV 2022

doi:10.1007/978-3-031-20074-8_33

Abstract

We introduce a few-shot localization dataset originating from photographers who authentically were trying to learn about the visual content in the images they took. It includes over 8,000 segmentations of 100 categories in over 4,000 images that were taken by people with visual impairments. Compared to existing few-shot object detection and instance segmentation datasets, our dataset is the first to locate holes in objects (e.g., found in 12.4% of our segmentations), it shows objects that occupy a much larger range of sizes relative to the images, and text is over five times more common in our objects (e.g., found in 24.7% of our segmentations). Analysis of two modern few-shot localization algorithms demonstrates that they generalize poorly to our new dataset. The algorithms commonly struggle to locate objects with holes, very small and very large objects, and objects lacking text. To encourage a larger community to work on these unsolved challenges, we publicly share our annotated few-shot dataset at http://anonymous.

Cite

Text

Tseng et al. "VizWiz-FewShot: Locating Objects in Images Taken by People with Visual Impairments." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-20074-8_33

Markdown

[Tseng et al. "VizWiz-FewShot: Locating Objects in Images Taken by People with Visual Impairments." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/tseng2022eccv-vizwizfewshot/) doi:10.1007/978-3-031-20074-8_33

BibTeX

@inproceedings{tseng2022eccv-vizwizfewshot,
  title     = {{VizWiz-FewShot: Locating Objects in Images Taken by People with Visual Impairments}},
  author    = {Tseng, Yu-Yun and Bell, Alexander and Gurari, Danna},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2022},
  doi       = {10.1007/978-3-031-20074-8_33},
  url       = {https://mlanthology.org/eccv/2022/tseng2022eccv-vizwizfewshot/}
}