Action Recognition from RGB-D Data: Comparison and Fusion of Spatio-Temporal Handcrafted Features and Deep Strategies

Abstract

In this work, multimodal fusion of RGB-D data is analyzed for action recognition by using scene flow as early fusion and integrating the results of all modalities in a late fusion fashion. Recently, there has been a migration from traditional handcrafted features to deep learning. However, handcrafted features are still widely used owing to their high performance and low computational complexity. In this research, multimodal dense trajectories (MMDT) are proposed to describe RGB-D videos, with dense trajectories pruned based on scene flow data. In addition, a 2DCNN is extended to a multimodal variant (MM2DCNN) by adding one more stream (scene flow) as input and then fusing the outputs of all models. We evaluate and compare the results from each modality and their fusion on two action datasets. Experimental results show that the new representation improves accuracy. Furthermore, the fusion of handcrafted and learning-based features boosts the final performance, achieving state-of-the-art results.

Cite

Text

Asadi-Aghbolaghi et al. "Action Recognition from RGB-D Data: Comparison and Fusion of Spatio-Temporal Handcrafted Features and Deep Strategies." IEEE/CVF International Conference on Computer Vision Workshops, 2017. doi:10.1109/ICCVW.2017.376

Markdown

[Asadi-Aghbolaghi et al. "Action Recognition from RGB-D Data: Comparison and Fusion of Spatio-Temporal Handcrafted Features and Deep Strategies." IEEE/CVF International Conference on Computer Vision Workshops, 2017.](https://mlanthology.org/iccvw/2017/asadiaghbolaghi2017iccvw-action/) doi:10.1109/ICCVW.2017.376

BibTeX

@inproceedings{asadiaghbolaghi2017iccvw-action,
  title     = {{Action Recognition from RGB-D Data: Comparison and Fusion of Spatio-Temporal Handcrafted Features and Deep Strategies}},
  author    = {Asadi-Aghbolaghi, Maryam and Bertiche, Hugo and Roig, Vicent and Kasaei, Shohreh and Escalera, Sergio},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2017},
  pages     = {3179--3188},
  doi       = {10.1109/ICCVW.2017.376},
  url       = {https://mlanthology.org/iccvw/2017/asadiaghbolaghi2017iccvw-action/}
}