Poseur: Direct Human Pose Regression with Transformers

Abstract

We propose a direct, regression-based approach to 2D human pose estimation from single images. We formulate the problem as a sequence prediction task, which we solve using a Transformer network. This network directly learns a regression mapping from images to the keypoint coordinates, without resorting to intermediate representations such as heatmaps. This approach avoids much of the complexity associated with heatmap-based approaches. To overcome the feature misalignment issues of previous regression-based methods, we propose an attention mechanism that adaptively attends to the features that are most relevant to the target keypoints, considerably improving the accuracy. Importantly, our framework is end-to-end differentiable, and naturally learns to exploit the dependencies between keypoints. Experiments on MS-COCO and MPII, two predominant pose-estimation datasets, demonstrate that our method significantly improves upon the state-of-the-art in regression-based pose estimation. More notably, ours is the first regression-based approach to perform favorably compared to the best heatmap-based pose estimation methods. Code is available at: https://github.com/aim-uofa/Poseur
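To illustrate the idea of regressing keypoint coordinates through attention rather than decoding them from heatmaps, here is a minimal NumPy sketch. It is not the Poseur architecture, which this abstract does not specify. The single-head cross-attention, the sigmoid output head, the random weights, and every name in it (`regress_keypoints`, `w_q`, `w_k`, `w_v`, `w_out`) are assumptions made for illustration only. Each keypoint is given its own query vector, the query attends over flattened image feature tokens, and a linear head maps the attended feature to normalized (x, y) coordinates.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def regress_keypoints(features, queries, w_q, w_k, w_v, w_out):
    """Hypothetical single-head cross-attention regressor.

    features: (N, D) flattened image feature tokens
    queries:  (K, D) one query vector per keypoint
    Returns normalized coordinates (K, 2) and attention weights (K, N).
    """
    q = queries @ w_q
    k = features @ w_k
    v = features @ w_v
    # Each keypoint query weights the feature tokens most relevant to it.
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    attended = attn @ v
    # Sigmoid keeps the regressed (x, y) inside the unit square.
    coords = 1.0 / (1.0 + np.exp(-(attended @ w_out)))
    return coords, attn

rng = np.random.default_rng(0)
num_tokens, dim, num_keypoints = 64, 32, 17  # 17 keypoints, as annotated in MS-COCO
features = rng.normal(size=(num_tokens, dim))
queries = rng.normal(size=(num_keypoints, dim))
# Untrained random weights: the outputs are meaningless, only the data flow matters.
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(3))
w_out = rng.normal(size=(dim, 2)) / np.sqrt(dim)

coords, attn = regress_keypoints(features, queries, w_q, w_k, w_v, w_out)
print(coords.shape, attn.shape)
```

Every step in this sketch is differentiable, so a loss on the predicted coordinates could be backpropagated to the weights and queries directly, with no heatmap rendering or arg-max decoding in between.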

Cite

Text

Mao et al. "Poseur: Direct Human Pose Regression with Transformers." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-20068-7_5

Markdown

[Mao et al. "Poseur: Direct Human Pose Regression with Transformers." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/mao2022eccv-poseur/) doi:10.1007/978-3-031-20068-7_5

BibTeX

@inproceedings{mao2022eccv-poseur,
  title     = {{Poseur: Direct Human Pose Regression with Transformers}},
  author    = {Mao, Weian and Ge, Yongtao and Shen, Chunhua and Tian, Zhi and Wang, Xinlong and Wang, Zhibin and van den Hengel, Anton},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2022},
  doi       = {10.1007/978-3-031-20068-7_5},
  url       = {https://mlanthology.org/eccv/2022/mao2022eccv-poseur/}
}