Dual Attention Poser: Dual Path Body Tracking Based on Attention

Abstract

Tracking a user's full body from the head and hand poses reported by a mixed reality head-mounted display is an important mode of human-computer interaction. Unfortunately, the user's virtual representation and experience are limited by high reconstruction error when a simple transformer network architecture is applied. In this paper, we present a novel model, named Dual Attention Poser, which learns full-body reconstruction with high accuracy. The proposed model consists of three key modules. A dual-path attention encoder extracts features from the sparse signals. A cross-attention mixer module fuses the representations from the two paths. An attention-gated MLP decoder decodes the hidden features of the sparse input through an attention gate. Test results on the AMASS dataset show that Dual Attention Poser reduces the error by up to 18.2% compared with state-of-the-art results.
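The abstract names the three modules but gives no layer definitions, so the following is only a rough NumPy sketch of how such a pipeline could be wired together, not the authors' implementation. Everything specific in it is an assumption made for illustration: the two paths are modeled as two independently parameterized single-head self-attention layers over the same window of frames, the mixer as residual cross-attention in both directions, and the gate as a sigmoid over an attention output. The dimensions (`feat_dim=18`, `hidden_dim=32`, `pose_dim=12`) and all function and parameter names are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(query_tokens, context_tokens, w):
    # Single-head scaled dot-product attention:
    # query_tokens attend to context_tokens.
    q = query_tokens @ w["q"]
    k = context_tokens @ w["k"]
    v = context_tokens @ w["v"]
    return softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v

def init_params(rng, feat_dim, hidden_dim, pose_dim):
    # Random weights only; a real model would be trained (e.g. on AMASS).
    def mat(rows, cols):
        return rng.normal(0.0, rows ** -0.5, size=(rows, cols))

    def attn():
        return {name: mat(hidden_dim, hidden_dim) for name in ("q", "k", "v")}

    return {
        "embed_a": mat(feat_dim, hidden_dim),
        "embed_b": mat(feat_dim, hidden_dim),
        "self_a": attn(), "self_b": attn(),
        "cross_ab": attn(), "cross_ba": attn(),
        "gate": attn(),
        "mlp": mat(2 * hidden_dim, hidden_dim),
        "out": mat(hidden_dim, pose_dim),
    }

def forward(x, p):
    # x: (T, feat_dim) window of sparse head/hand tracking features.
    # 1) Dual-path encoder (assumed: two separate self-attention paths).
    a = x @ p["embed_a"]
    b = x @ p["embed_b"]
    a = a + attention(a, a, p["self_a"])
    b = b + attention(b, b, p["self_b"])
    # 2) Cross-attention mixer (assumed: each path queries the other).
    mixed_a = a + attention(a, b, p["cross_ab"])
    mixed_b = b + attention(b, a, p["cross_ba"])
    fused = np.concatenate([mixed_a, mixed_b], axis=-1)   # (T, 2 * hidden_dim)
    # 3) Attention-gated MLP decoder (assumed: sigmoid gate from attention).
    hidden = np.tanh(fused @ p["mlp"])                     # (T, hidden_dim)
    gate = 1.0 / (1.0 + np.exp(-attention(hidden, hidden, p["gate"])))
    return (gate * hidden) @ p["out"]                      # (T, pose_dim)

rng = np.random.default_rng(0)
params = init_params(rng, feat_dim=18, hidden_dim=32, pose_dim=12)
window = rng.normal(size=(40, 18))   # 40 frames of hypothetical features
pose = forward(window, params)
print(pose.shape)  # (40, 12)
```

With untrained random weights the output values are meaningless; the sketch only shows how the data could flow through an encoder, a mixer, and a gated decoder.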

Cite

Text

Di et al. "Dual Attention Poser: Dual Path Body Tracking Based on Attention." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2023. doi:10.1109/CVPRW59228.2023.00280

Markdown

[Di et al. "Dual Attention Poser: Dual Path Body Tracking Based on Attention." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2023.](https://mlanthology.org/cvprw/2023/di2023cvprw-dual/) doi:10.1109/CVPRW59228.2023.00280

BibTeX

@inproceedings{di2023cvprw-dual,
  title     = {{Dual Attention Poser: Dual Path Body Tracking Based on Attention}},
  author    = {Di, Xinhan and Dai, Xiaokun and Zhang, Xinkang and Chen, Xinrong},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2023},
  pages     = {2795--2804},
  doi       = {10.1109/CVPRW59228.2023.00280},
  url       = {https://mlanthology.org/cvprw/2023/di2023cvprw-dual/}
}