The Multi-Modal Video Reasoning and Analyzing Competition

Abstract

In this paper, we introduce the Multi-Modal Video Reasoning and Analyzing Competition (MMVRAC) workshop, held in conjunction with ICCV 2021. The competition comprises four tracks: video question answering, skeleton-based action recognition, fisheye video-based action recognition, and person re-identification. These tracks are based on two datasets, SUTD-TrafficQA and UAV-Human. We summarize the top-performing methods submitted by participants and report the results they achieved in the competition.

Cite

Text

Peng et al. "The Multi-Modal Video Reasoning and Analyzing Competition." IEEE/CVF International Conference on Computer Vision Workshops, 2021. doi:10.1109/ICCVW54120.2021.00095

Markdown

[Peng et al. "The Multi-Modal Video Reasoning and Analyzing Competition." IEEE/CVF International Conference on Computer Vision Workshops, 2021.](https://mlanthology.org/iccvw/2021/peng2021iccvw-multimodal/) doi:10.1109/ICCVW54120.2021.00095

BibTeX

@inproceedings{peng2021iccvw-multimodal,
  title     = {{The Multi-Modal Video Reasoning and Analyzing Competition}},
  author    = {Peng, Haoran and Huang, He and Xu, Li and Li, Tianjiao and Liu, Jun and Rahmani, Hossein and Ke, Qiuhong and Guo, Zhicheng and Wu, Cong and Li, Rongchang and Ye, Mang and Wang, Jiahao and Zhang, Jiaxu and Liu, Yuanzhong and He, Tao and Zhang, Fuwei and Liu, Xianbin and Lin, Tao},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2021},
  pages     = {806--813},
  doi       = {10.1109/ICCVW54120.2021.00095},
  url       = {https://mlanthology.org/iccvw/2021/peng2021iccvw-multimodal/}
}