EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding
Abstract
We present EgoExo-Fitness, a new full-body action understanding dataset, featuring fitness sequence videos recorded from synchronized egocentric and fixed exocentric (third-person) cameras. Compared with existing full-body action understanding datasets, EgoExo-Fitness not only contains videos from first-person perspectives, but also provides rich annotations. Specifically, two-level temporal boundaries are provided to localize single-action videos along with the sub-steps of each action. More importantly, EgoExo-Fitness introduces innovative annotations for interpretable action judgement, including technical keypoint verification, natural language comments on action execution, and action quality scores. Combining all of these, EgoExo-Fitness provides new resources to study egocentric and exocentric full-body action understanding across the dimensions of "what", "when", and "how well". To facilitate research on egocentric and exocentric full-body action understanding, we construct benchmarks on a suite of tasks (e.g., action classification, action localization, cross-view sequence verification, cross-view skill determination, and a newly proposed task of guidance-based execution verification), together with detailed analysis. Data and code are available at https://github.com/iSEE-Laboratory/EgoExo-Fitness/tree/main.
Cite
Text
Li et al. "EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72661-3_21
Markdown
[Li et al. "EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/li2024eccv-egoexofitness/) doi:10.1007/978-3-031-72661-3_21
BibTeX
@inproceedings{li2024eccv-egoexofitness,
title = {{EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding}},
author = {Li, Yuan-Ming and Huang, Wei-Jin and Wang, An-Lan and Zeng, Ling-An and Meng, Jing-Ke and Zheng, Wei-Shi},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2024},
doi = {10.1007/978-3-031-72661-3_21},
url = {https://mlanthology.org/eccv/2024/li2024eccv-egoexofitness/}
}