Pose and Joint-Aware Action Recognition
Abstract
Recent progress on action recognition has mainly focused on RGB and optical-flow features. In this paper, we approach the problem of joint-based action recognition. Unlike other modalities, constellations of joints and their motion carry succinct information about human movement, making them well suited to activity recognition. We present a new model for joint-based action recognition that first extracts motion features from each joint separately through a shared motion encoder before performing collective reasoning. Our joint selector module re-weights the joint information to select the most discriminative joints for the task. We also propose a novel joint-contrastive loss that pulls together groups of joint features conveying the same action. We strengthen the joint-based representations with a geometry-aware data augmentation technique that jitters pose heatmaps while retaining the dynamics of the action. We show large improvements over current state-of-the-art joint-based approaches on the JHMDB, HMDB, Charades, and AVA action recognition datasets. Late fusion with RGB- and flow-based approaches yields additional improvements. Our model also outperforms the existing baseline on Mimetics, a dataset of out-of-context actions.
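The abstract names two concrete mechanisms: a shared per-joint motion encoder followed by a joint selector that re-weights joints, and a joint-contrastive loss that pulls together features of clips depicting the same action. Below is a minimal PyTorch sketch of one plausible reading of these components; the module names, feature dimensions, and the supervised-contrastive formulation of the loss are our assumptions for illustration, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointSelectorModel(nn.Module):
    """Encode each joint's motion with a shared encoder, then re-weight
    joints with a learned selector before pooling for classification.
    All sizes below are illustrative placeholders."""
    def __init__(self, in_dim=64, feat_dim=128, num_joints=15, num_classes=21):
        super().__init__()
        # Shared motion encoder, applied to every joint independently.
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim),
        )
        # Joint selector: scores each joint feature; a softmax over joints
        # yields per-joint weights emphasizing the most discriminative joints.
        self.selector = nn.Linear(feat_dim, 1)
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        # x: (batch, num_joints, in_dim) per-joint motion features.
        f = self.encoder(x)                       # (B, J, D)
        w = F.softmax(self.selector(f), dim=1)    # (B, J, 1) joint weights
        pooled = (w * f).sum(dim=1)               # weighted sum over joints
        return self.classifier(pooled), pooled

def joint_contrastive_loss(feats, labels, tau=0.1):
    """Supervised-contrastive-style loss over pooled joint features:
    pulls together features of samples sharing the same action label.
    This is our reading of the joint-contrastive loss, not the paper's code."""
    z = F.normalize(feats, dim=1)                 # (B, D)
    sim = z @ z.t() / tau                         # pairwise similarities
    mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)).float()
    mask.fill_diagonal_(0)                        # exclude self-pairs
    logits = sim - sim.max(dim=1, keepdim=True).values.detach()
    self_mask = 1.0 - torch.eye(len(z), device=z.device)
    exp = torch.exp(logits) * self_mask           # drop self in denominator
    log_prob = logits - torch.log(exp.sum(dim=1, keepdim=True) + 1e-8)
    pos = (mask * log_prob).sum(dim=1) / mask.sum(dim=1).clamp(min=1)
    return -pos.mean()

# Toy usage: classification loss plus the contrastive term.
model = JointSelectorModel()
x = torch.randn(8, 15, 64)                        # batch of per-joint features
y = torch.randint(0, 21, (8,))
logits, pooled = model(x)
loss = F.cross_entropy(logits, y) + joint_contrastive_loss(pooled, y)
```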
Cite
Text
Shah et al. "Pose and Joint-Aware Action Recognition." Winter Conference on Applications of Computer Vision, 2022.

Markdown

[Shah et al. "Pose and Joint-Aware Action Recognition." Winter Conference on Applications of Computer Vision, 2022.](https://mlanthology.org/wacv/2022/shah2022wacv-pose/)

BibTeX
@inproceedings{shah2022wacv-pose,
title = {{Pose and Joint-Aware Action Recognition}},
author = {Shah, Anshul and Mishra, Shlok and Bansal, Ankan and Chen, Jun-Cheng and Chellappa, Rama and Shrivastava, Abhinav},
booktitle = {Winter Conference on Applications of Computer Vision},
year = {2022},
pages = {3850--3860},
url = {https://mlanthology.org/wacv/2022/shah2022wacv-pose/}
}