Prototype-Based Dataset Comparison

Abstract

Dataset summarisation is a fruitful approach to dataset inspection. However, when applied to a single dataset, the discovery of visual concepts is restricted to those most prominent. We argue that a comparative approach can expand upon this paradigm to enable richer forms of dataset inspection that go beyond the most prominent concepts. To enable dataset comparison, we present a module that learns concept-level prototypes across datasets. We leverage self-supervised learning to discover these prototypes without supervision, and we demonstrate the benefits of our approach in two case studies. Our findings show that dataset comparison extends dataset inspection, and we hope to encourage more work in this direction. Code and usage instructions are available at https://github.com/Nanne/ProtoSim

Cite

Text

van Noord. "Prototype-Based Dataset Comparison." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00186

Markdown

[van Noord. "Prototype-Based Dataset Comparison." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/vannoord2023iccv-prototypebased/) doi:10.1109/ICCV51070.2023.00186

BibTeX

@inproceedings{vannoord2023iccv-prototypebased,
  title     = {{Prototype-Based Dataset Comparison}},
  author    = {van Noord, Nanne},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {1944--1954},
  doi       = {10.1109/ICCV51070.2023.00186},
  url       = {https://mlanthology.org/iccv/2023/vannoord2023iccv-prototypebased/}
}