Nymeria: A Massive Collection of Egocentric Multi-Modal Human Motion in the Wild
Abstract
We introduce Nymeria, a large-scale, diverse, richly annotated human motion dataset collected in the wild with multiple multimodal egocentric devices. The dataset comes with a) full-body ground-truth motion; b) multimodal egocentric data from multiple Project Aria devices, including video, eye tracking, and IMUs; and c) a third-person perspective from an additional “observer”. All devices are precisely synchronized and localized in one metric 3D world. We derive a hierarchical protocol to add in-context language descriptions of human motion, from fine-grained motion narrations to simplified atomic actions and high-level activity summarization. To the best of our knowledge, the Nymeria dataset is the world’s largest collection of human motion in the wild; the first of its kind to provide synchronized and localized multi-device multimodal egocentric data; and the world’s largest motion-language dataset. It provides hours of daily activities from participants across locations, total travelling distance over . The language descriptions contain sentences in words from a vocabulary size of 6545. To demonstrate the potential of the dataset, we evaluate several SOTA algorithms for egocentric body tracking, motion synthesis, and action recognition.
Cite
Text
Ma et al. "Nymeria: A Massive Collection of Egocentric Multi-Modal Human Motion in the Wild." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72691-0_25
Markdown
[Ma et al. "Nymeria: A Massive Collection of Egocentric Multi-Modal Human Motion in the Wild." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/ma2024eccv-nymeria/) doi:10.1007/978-3-031-72691-0_25
BibTeX
@inproceedings{ma2024eccv-nymeria,
title = {{Nymeria: A Massive Collection of Egocentric Multi-Modal Human Motion in the Wild}},
author = {Ma, Lingni and Ye, Yuting and Postyeni, Rowan and Gamino, Alexander J and Baiyya, Vijay and Pesqueira, Luis and Bailey, Kevin M and Fosas, David Soriano and Hong, Fangzhou and Guzov, Vladimir and Jiang, Yifeng and Kim, Hyo Jin and Engel, Jakob and Liu, Karen and Liu, Ziwei and De Nardi, Renzo and Newcombe, Richard},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2024},
doi = {10.1007/978-3-031-72691-0_25},
url = {https://mlanthology.org/eccv/2024/ma2024eccv-nymeria/}
}