Learning Decoupled Representations for Human Pose Forecasting

Abstract

Human pose forecasting involves complex spatiotemporal interactions between body parts (e.g., arms, legs, spine). State-of-the-art approaches use Long Short-Term Memories (LSTMs) or Variational AutoEncoders (VAEs) to solve the problem. Yet, they do not effectively predict human motions when both global trajectory and local pose movements exist. We propose to learn decoupled representations for the global and local pose forecasting tasks. We also show that it is better to stop the prediction when the uncertainty in human motion increases. Our forecasting model outperforms all existing methods on the pose forecasting benchmark to date by over 20%. The code is available online †.
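The abstract describes decoupling a motion into a global trajectory and local pose movements. A minimal sketch of one such decomposition is shown below, under illustrative assumptions (a root joint at index 0 and a `(frames, joints, 3)` coordinate layout); this is not the paper's exact formulation.

```python
import numpy as np

def decouple(poses, root=0):
    """Split absolute poses (T, J, 3) into a global trajectory
    (the root joint's path, shape (T, 3)) and root-relative
    local poses (shape (T, J, 3))."""
    trajectory = poses[:, root, :]            # global motion
    local = poses - trajectory[:, None, :]    # pose relative to root
    return trajectory, local

def recouple(trajectory, local):
    """Invert the decomposition to recover absolute coordinates."""
    return local + trajectory[:, None, :]

# Usage: the two parts can be forecast separately, then recombined.
poses = np.random.rand(10, 17, 3)             # 10 frames, 17 joints
traj, local = decouple(poses)
recovered = recouple(traj, local)
assert np.allclose(recovered, poses)
```

Forecasting the two decoupled parts with separate models, then recombining them, is the high-level idea the abstract attributes to the method.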

Cite

Text

Parsaeifard et al. "Learning Decoupled Representations for Human Pose Forecasting." IEEE/CVF International Conference on Computer Vision Workshops, 2021. doi:10.1109/ICCVW54120.2021.00259

Markdown

[Parsaeifard et al. "Learning Decoupled Representations for Human Pose Forecasting." IEEE/CVF International Conference on Computer Vision Workshops, 2021.](https://mlanthology.org/iccvw/2021/parsaeifard2021iccvw-learning/) doi:10.1109/ICCVW54120.2021.00259

BibTeX

@inproceedings{parsaeifard2021iccvw-learning,
  title     = {{Learning Decoupled Representations for Human Pose Forecasting}},
  author    = {Parsaeifard, Behnam and Saadatnejad, Saeed and Liu, Yuejiang and Mordan, Taylor and Alahi, Alexandre},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2021},
  pages     = {2294--2303},
  doi       = {10.1109/ICCVW54120.2021.00259},
  url       = {https://mlanthology.org/iccvw/2021/parsaeifard2021iccvw-learning/}
}