An Object Is Worth Six Thousand Pictures: The Egocentric, Manual, Multi-Image (EMMI) Dataset
Abstract
We describe a new image dataset, the Egocentric, Manual, Multi-Image (EMMI) dataset, collected to enable the study of how appearance-related and distributional properties of visual experience affect learning outcomes. Images in EMMI come from first-person, wearable camera recordings of common household objects and toys being manually manipulated to undergo structured transformations like rotation and translation. We also present results from initial experiments, using deep convolutional neural networks, that begin to examine how different distributions of training data can affect visual object recognition, and how the representation of properties like rotation invariance can be studied in novel ways using the unique properties of EMMI.
Cite
Text
Wang et al. "An Object Is Worth Six Thousand Pictures: The Egocentric, Manual, Multi-Image (EMMI) Dataset." IEEE/CVF International Conference on Computer Vision Workshops, 2017. doi:10.1109/ICCVW.2017.279
Markdown
[Wang et al. "An Object Is Worth Six Thousand Pictures: The Egocentric, Manual, Multi-Image (EMMI) Dataset." IEEE/CVF International Conference on Computer Vision Workshops, 2017.](https://mlanthology.org/iccvw/2017/wang2017iccvw-object/) doi:10.1109/ICCVW.2017.279
BibTeX
@inproceedings{wang2017iccvw-object,
title = {{An Object Is Worth Six Thousand Pictures: The Egocentric, Manual, Multi-Image (EMMI) Dataset}},
author = {Wang, Xiaohan and Eliott, Fernanda Monteiro and Ainooson, James and Palmer, Joshua H. and Kunda, Maithilee},
booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
year = {2017},
pages = {2364--2372},
doi = {10.1109/ICCVW.2017.279},
url = {https://mlanthology.org/iccvw/2017/wang2017iccvw-object/}
}