MotionMixer: MLP-Based 3D Human Body Pose Forecasting

Abstract

In this work, we present MotionMixer, an efficient 3D human body pose forecasting model based solely on multi-layer perceptrons (MLPs). MotionMixer learns the spatial-temporal 3D body pose dependencies by sequentially mixing both modalities. Given a stacked sequence of 3D body poses, a spatial MLP extracts fine-grained spatial dependencies of the body joints. The interaction of the body joints over time is then modelled by a temporal MLP. The spatial-temporal mixed features are finally aggregated and decoded to obtain the future motion. To calibrate the influence of each time step in the pose sequence, we make use of squeeze-and-excitation (SE) blocks. We evaluate our approach on the Human3.6M, AMASS, and 3DPW datasets using the standard evaluation protocols. For all evaluations, we demonstrate state-of-the-art performance while using a model with fewer parameters. Our code is available at: https://github.com/MotionMLP/MotionMixer.
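
The mixing scheme described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that): the residual connections, layer normalization, ReLU activation, hidden width, SE reduction ratio `r`, joint count, and all function names below are assumptions made for illustration, and the final aggregation and decoding step is omitted.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize the last axis to zero mean and unit variance (no learned scale)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def mlp(x, w1, b1, w2, b2):
    """Two-layer perceptron along the last axis; ReLU is an assumption."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2

def se_time(x, w1, w2):
    """Squeeze-and-excitation gate over time steps for x of shape (T, C)."""
    squeezed = x.mean(axis=1)                    # squeeze: one scalar per time step, (T,)
    hidden = np.maximum(squeezed @ w1, 0.0)      # bottleneck of size T // r
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))  # sigmoid weights in (0, 1), shape (T,)
    return x * gate[:, None]                     # rescale every time step

def mixer_block(x, p):
    """One spatial-then-temporal mixing step on a pose sequence x of shape (T, C)."""
    # Spatial mixing: the MLP acts across the pose features of each time step.
    y = x + se_time(mlp(layer_norm(x), *p["spatial"]), *p["se_spatial"])
    # Temporal mixing: transpose so the MLP acts across time for each feature.
    z = mlp(layer_norm(y).T, *p["temporal"]).T
    return y + se_time(z, *p["se_temporal"])

def init_params(rng, T, C, hidden=32, r=2):
    """Random weights for illustration only; sizes are hypothetical."""
    def dense(n_in, n_out):
        return rng.standard_normal((n_in, n_out)) * 0.1
    def mlp_params(n, h):
        return (dense(n, h), np.zeros(h), dense(h, n), np.zeros(n))
    def se_params():
        return (dense(T, T // r), dense(T // r, T))
    return {"spatial": mlp_params(C, hidden), "se_spatial": se_params(),
            "temporal": mlp_params(T, hidden), "se_temporal": se_params()}

rng = np.random.default_rng(0)
T, J = 10, 22                             # hypothetical: 10 frames, 22 joints
poses = rng.standard_normal((T, J * 3))   # each row is one flattened 3D pose
params = init_params(rng, T, J * 3)
mixed = mixer_block(poses, params)
print(mixed.shape)  # (10, 66)
```

Transposing the sequence between the two MLPs is what lets the same fully connected layers mix first across joints and then across time, while the SE gate assigns each time step a weight between 0 and 1.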

Cite

Text

Bouazizi et al. "MotionMixer: MLP-Based 3D Human Body Pose Forecasting." International Joint Conference on Artificial Intelligence, 2022. doi:10.24963/IJCAI.2022/111

Markdown

[Bouazizi et al. "MotionMixer: MLP-Based 3D Human Body Pose Forecasting." International Joint Conference on Artificial Intelligence, 2022.](https://mlanthology.org/ijcai/2022/bouazizi2022ijcai-motionmixer/) doi:10.24963/IJCAI.2022/111

BibTeX

@inproceedings{bouazizi2022ijcai-motionmixer,
  title     = {{MotionMixer: MLP-Based 3D Human Body Pose Forecasting}},
  author    = {Bouazizi, Arij and Holzbock, Adrian and Kressel, Ulrich and Dietmayer, Klaus and Belagiannis, Vasileios},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2022},
  pages     = {791--798},
  doi       = {10.24963/IJCAI.2022/111},
  url       = {https://mlanthology.org/ijcai/2022/bouazizi2022ijcai-motionmixer/}
}