Efficient Skeleton-Based Action Recognition via Joint-Mapping Strategies

Abstract

Graph convolutional networks (GCNs) have brought remarkable progress in skeleton-based action recognition. However, their high computational cost and large model size make them difficult to deploy in real-world embedded systems. In particular, a GCN deployed in an automated surveillance system depends on upstream models such as pedestrian detection and human pose estimation. Therefore, each model should be computationally lightweight, and the whole pipeline should run in real time. In this paper, we propose two different joint-mapping modules that reduce the number of joint representations, lowering both the total computational cost and the model size. Our models achieve a better accuracy-latency trade-off than previous state-of-the-art methods on two datasets, NTU RGB+D and NTU RGB+D 120, demonstrating their suitability for practical applications. Furthermore, we measure the latency of the models using the TensorRT framework to compare them from a practical perspective.
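
As a rough illustration of what reducing the number of joint representations can mean, the sketch below pools per-joint features into a smaller set of mapped joints with a fixed averaging matrix. It is not the paper's method: the abstract does not describe the two proposed modules, so the grouping, the helper names `build_mapping` and `map_joints`, and the toy 6-joint skeleton are all assumptions made for illustration only.

```python
# Hypothetical illustration only: pool V joint features into V' < V mapped
# joints using a fixed averaging matrix. The grouping is an assumption and
# does not reproduce the joint-mapping modules proposed in the paper.

def build_mapping(groups, num_joints):
    """Return a (len(groups) x num_joints) matrix whose rows average one group."""
    mapping = []
    for members in groups:
        row = [0.0] * num_joints
        for j in members:
            row[j] = 1.0 / len(members)
        mapping.append(row)
    return mapping


def map_joints(mapping, features):
    """Multiply mapping (V' x V) by features (V x C), giving V' x C."""
    channels = len(features[0])
    return [
        [sum(w * features[j][c] for j, w in enumerate(row)) for c in range(channels)]
        for row in mapping
    ]


# Toy skeleton: 6 joints with 2 feature channels each, pooled into 3 groups.
groups = [[0, 1], [2, 3], [4, 5]]
features = [
    [1.0, 2.0], [3.0, 4.0],
    [5.0, 6.0], [7.0, 8.0],
    [9.0, 10.0], [11.0, 12.0],
]
mapping = build_mapping(groups, num_joints=6)
reduced = map_joints(mapping, features)
print(reduced)
```

Any later graph convolution then operates on 3 joint representations instead of 6, which is the general source of savings the abstract refers to. A learned mapping would replace the fixed averaging weights with trainable ones.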

Cite

Text

Kang et al. "Efficient Skeleton-Based Action Recognition via Joint-Mapping Strategies." Winter Conference on Applications of Computer Vision, 2023.

Markdown

[Kang et al. "Efficient Skeleton-Based Action Recognition via Joint-Mapping Strategies." Winter Conference on Applications of Computer Vision, 2023.](https://mlanthology.org/wacv/2023/kang2023wacv-efficient/)

BibTeX

@inproceedings{kang2023wacv-efficient,
  title     = {{Efficient Skeleton-Based Action Recognition via Joint-Mapping Strategies}},
  author    = {Kang, Min-Seok and Kang, Dongoh and Kim, HanSaem},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2023},
  pages     = {3403--3412},
  url       = {https://mlanthology.org/wacv/2023/kang2023wacv-efficient/}
}