Object-Relation Reasoning Graph for Action Recognition

Abstract

Action recognition is a challenging task because the attributes of objects, as well as the relationships between them, change constantly throughout a video. Existing methods mainly use object-level graphs or scene graphs to represent the dynamics of objects and relationships, but do not directly model fine-grained relationship transitions. In this paper, we propose an Object-Relation Reasoning Graph (OR2G) for reasoning about actions in videos. By combining an object-level graph (OG) and a relation-level graph (RG), the proposed OR2G captures the attribute transitions of objects and simultaneously reasons about the relationship transitions between objects. In addition, we introduce a graph aggregating module (GAM) that applies a multi-head edge-to-node message passing operation. GAM feeds information from relation nodes back to object nodes, strengthening the coupling between the object-level graph and the relation-level graph. Experiments on video action recognition demonstrate the effectiveness of our approach compared with state-of-the-art methods.
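The abstract's key mechanism, multi-head edge-to-node message passing, can be illustrated with a minimal sketch. This is not the authors' implementation; it is a generic attention-style aggregation under assumed shapes: each head scores the relation (edge) features incident to an object node against that node's features, softmax-normalizes the scores, and adds the aggregated relation message back to the node residually. The function name and all parameters are hypothetical.

```python
import numpy as np

def multi_head_edge_to_node(node_feats, edge_feats, edges, num_heads=4):
    """Hypothetical sketch of edge-to-node message passing.

    node_feats: (N, D) object-node features
    edge_feats: (E, D) relation-node features
    edges:      list of (src, dst) node-index pairs; edge i links edges[i]
    Each head attends from a node to its incident relation features and
    adds the weighted sum back to the node (residual update).
    """
    N, D = node_feats.shape
    E = edge_feats.shape[0]
    assert D % num_heads == 0, "feature dim must split evenly across heads"
    d = D // num_heads
    nh = node_feats.reshape(N, num_heads, d)   # per-head node features
    eh = edge_feats.reshape(E, num_heads, d)   # per-head edge features
    out = node_feats.copy()
    for n in range(N):
        # relation nodes incident to object node n
        inc = [i for i, (s, t) in enumerate(edges) if n in (s, t)]
        if not inc:
            continue  # isolated node: no message, feature unchanged
        for h in range(num_heads):
            # scaled dot-product scores, softmax-normalized per node
            scores = np.array([nh[n, h] @ eh[i, h] for i in inc]) / np.sqrt(d)
            w = np.exp(scores - scores.max())
            w /= w.sum()
            msg = sum(wi * eh[i, h] for wi, i in zip(w, inc))
            out[n, h * d:(h + 1) * d] += msg   # residual edge-to-node update
    return out
```

In the paper's terms, `edge_feats` would come from the relation-level graph and the residual update enriches the object-level graph, which is the coupling role the abstract assigns to GAM.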

Cite

Text

Ou et al. "Object-Relation Reasoning Graph for Action Recognition." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.01950

Markdown

[Ou et al. "Object-Relation Reasoning Graph for Action Recognition." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/ou2022cvpr-objectrelation/) doi:10.1109/CVPR52688.2022.01950

BibTeX

@inproceedings{ou2022cvpr-objectrelation,
  title     = {{Object-Relation Reasoning Graph for Action Recognition}},
  author    = {Ou, Yangjun and Mi, Li and Chen, Zhenzhong},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {20133--20142},
  doi       = {10.1109/CVPR52688.2022.01950},
  url       = {https://mlanthology.org/cvpr/2022/ou2022cvpr-objectrelation/}
}