EgoBody3M: Egocentric Body Tracking on a VR Headset Using a Diverse Dataset

Abstract

Accurate tracking of a user’s body pose while wearing a virtual reality (VR), augmented reality (AR), or mixed reality (MR) headset is a prerequisite for authentic self-expression, natural social presence, and intuitive user interfaces. Existing body tracking approaches on VR/AR devices are either under-constrained, e.g., attempting to infer full body pose from only headset and controller pose, or require impractical hardware setups that place cameras far from a user’s face to improve body visibility. In this paper, we present the first controller-less egocentric body tracking solution that runs on an actual VR device using the same cameras that are used for SLAM tracking. We propose a novel egocentric tracking architecture that models the temporal history of body motion using multi-view latent features. Furthermore, we release the first large-scale real-image dataset for egocentric body tracking, EgoBody3M, with a realistic VR headset configuration and diverse subjects and motions. Benchmarks on the dataset show that our approach outperforms other state-of-the-art methods in both accuracy and smoothness of the resulting motion. We perform ablation studies on our model choices and demonstrate the method running in real time on a VR headset. Our dataset, with more than 30 hours of recordings and 3 million frames, will be made publicly available.

Cite

Text

Zhao et al. "EgoBody3M: Egocentric Body Tracking on a VR Headset Using a Diverse Dataset." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72986-7_22

Markdown

[Zhao et al. "EgoBody3M: Egocentric Body Tracking on a VR Headset Using a Diverse Dataset." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/zhao2024eccv-egobody3m/) doi:10.1007/978-3-031-72986-7_22

BibTeX

@inproceedings{zhao2024eccv-egobody3m,
  title     = {{EgoBody3M: Egocentric Body Tracking on a VR Headset Using a Diverse Dataset}},
  author    = {Zhao, Amy and Tang, Chengcheng and Wang, Lezi and Li, Yijing and Dave, Mihika and Tao, Lingling and Twigg, Christopher D. and Wang, Robert Y.},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72986-7_22},
  url       = {https://mlanthology.org/eccv/2024/zhao2024eccv-egobody3m/}
}