BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics

Abstract

Recent advances in text-to-motion generation have inspired numerous attempts at convenient and interactive human motion generation. Yet existing methods are largely limited to generating body motions only, without considering rich two-hand motions, let alone handling various conditions such as body dynamics or text. To break the data bottleneck, we propose BOTH57M, a novel multi-modal dataset for two-hand motion generation. Our dataset includes accurate motion tracking for the human body and hands and provides pairwise finger-level hand annotations and body descriptions. We further provide a strong baseline method, BOTH2Hands, for the novel task: generating vivid two-hand motions from both implicit body dynamics and explicit text prompts. We first warm up two parallel body-to-hand and text-to-hand diffusion models and then utilize a cross-attention transformer for motion blending. Extensive experiments and cross-validations demonstrate the effectiveness of our approach and dataset for generating convincing two-hand motions from hybrid body-and-text conditions. Our dataset and code will be released to the community for future research at https://github.com/Godheritage/BOTH2Hands.

Cite

Text

Zhang et al. "BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.00232

Markdown

[Zhang et al. "BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/zhang2024cvpr-both2hands/) doi:10.1109/CVPR52733.2024.00232

BibTeX

@inproceedings{zhang2024cvpr-both2hands,
  title     = {{BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics}},
  author    = {Zhang, Wenqian and Huang, Molin and Zhou, Yuxuan and Zhang, Juze and Yu, Jingyi and Wang, Jingya and Xu, Lan},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {2393-2404},
  doi       = {10.1109/CVPR52733.2024.00232},
  url       = {https://mlanthology.org/cvpr/2024/zhang2024cvpr-both2hands/}
}