The Dollar Street Dataset: Images Representing the Geographic and Socioeconomic Diversity of the World

Abstract

It is crucial that image datasets for computer vision are representative and contain accurate demographic information to ensure their robustness and fairness, especially for smaller subpopulations. To address this issue, we present Dollar Street, a supervised dataset that contains 38,479 images of everyday household items from homes around the world. This dataset was manually curated and fully labeled, including tags for objects (e.g., “toilet,” “toothbrush,” “stove”) and demographic data such as region, country, and monthly household income. This dataset includes images from homes with no internet access and incomes as low as $26.99 per month, visually capturing valuable socioeconomic diversity of traditionally under-represented populations. All images and data are licensed under CC-BY, permitting their use in academic and commercial work. Moreover, we show that this dataset can improve the performance of classification tasks for images of household items from lower-income homes, addressing a critical need for datasets that combat bias.

Cite

Text

Rojas et al. "The Dollar Street Dataset: Images Representing the Geographic and Socioeconomic Diversity of the World." Neural Information Processing Systems, 2022.

Markdown

[Rojas et al. "The Dollar Street Dataset: Images Representing the Geographic and Socioeconomic Diversity of the World." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/rojas2022neurips-dollar/)

BibTeX

@inproceedings{rojas2022neurips-dollar,
  title     = {{The Dollar Street Dataset: Images Representing the Geographic and Socioeconomic Diversity of the World}},
  author    = {Rojas, William Gaviria and Diamos, Sudnya and Kini, Keertan and Kanter, David and Reddi, Vijay Janapa and Coleman, Cody},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/rojas2022neurips-dollar/}
}