A Multiview Depth-Based Motion Capture Benchmark Dataset for Human Motion Denoising and Enhancement Research
Abstract
Human motion enhancement is a rapidly expanding field of study in which depth-based motion capture (D-Mocap) is refined into a more accurate counterpart for real-world applications that demand high precision. The initial D-Mocap skeletal sequence is produced by commercially available SDKs or open source tools, which work best in an ideal front-facing camera setup. This creates a challenging initialization for human motion enhancement when the camera is not in the ideal front-facing position. Currently, no multiview D-Mocap dataset provides corresponding time-synced and skeleton-matched optical motion capture (Mocap) reference data for view-invariant motion enhancement. We develop a multiview D-Mocap dataset extended from the popular and comprehensive Berkeley MHAD dataset [29]. In addition, we analyze the performance of D-Mocap data generated through a series of open source tools, highlighting both the difficulty of and the need for robust results in a rear-facing camera setup, where average joint position error is 21.4% higher than for front-facing data. Finally, we compare the results of several recent human motion enhancement algorithms in a front-facing camera setup versus a rear-facing one.
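To make the reported metric concrete, the following is a minimal sketch of how an average joint position error and its relative increase between two camera views could be computed. The abstract does not give the exact definition used in the paper, so the per-joint Euclidean distance averaged over all joints and frames is an assumption, and the function names and toy coordinates are hypothetical.

```python
import math

def mean_joint_position_error(estimated, reference):
    """Average Euclidean distance between matched joints over all frames.

    Assumption: both inputs are lists of frames, each frame a list of
    (x, y, z) joint positions, time-synced and in the same joint order.
    """
    if len(estimated) != len(reference):
        raise ValueError("sequences must have the same number of frames")
    total, count = 0.0, 0
    for est_frame, ref_frame in zip(estimated, reference):
        if len(est_frame) != len(ref_frame):
            raise ValueError("frames must have the same number of joints")
        for est_joint, ref_joint in zip(est_frame, ref_frame):
            total += math.dist(est_joint, ref_joint)
            count += 1
    if count == 0:
        raise ValueError("no joints to compare")
    return total / count

def relative_increase_percent(baseline, other):
    """Percent by which `other` exceeds `baseline`."""
    return (other - baseline) / baseline * 100.0

# Toy data (hypothetical): one frame, two joints, arbitrary units.
reference = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
front_view = [[(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]]
rear_view = [[(0.0, 0.0, 1.5), (1.0, 0.0, 1.5)]]

front_error = mean_joint_position_error(front_view, reference)
rear_error = mean_joint_position_error(rear_view, reference)
increase = relative_increase_percent(front_error, rear_error)
```

Under this definition, a figure such as the 21.4% quoted above would correspond to `relative_increase_percent(front_error, rear_error)` evaluated on the real front-facing and rear-facing sequences rather than on toy values.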
Cite
Text
Lannan et al. "A Multiview Depth-Based Motion Capture Benchmark Dataset for Human Motion Denoising and Enhancement Research." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022. doi:10.1109/CVPRW56347.2022.00058
Markdown
[Lannan et al. "A Multiview Depth-Based Motion Capture Benchmark Dataset for Human Motion Denoising and Enhancement Research." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022.](https://mlanthology.org/cvprw/2022/lannan2022cvprw-multiview/) doi:10.1109/CVPRW56347.2022.00058
BibTeX
@inproceedings{lannan2022cvprw-multiview,
title = {{A Multiview Depth-Based Motion Capture Benchmark Dataset for Human Motion Denoising and Enhancement Research}},
author = {Lannan, Nate and Zhou, Le and Fan, Guoliang},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2022},
pages = {426-435},
doi = {10.1109/CVPRW56347.2022.00058},
url = {https://mlanthology.org/cvprw/2022/lannan2022cvprw-multiview/}
}