The Multi-Modal Video Reasoning and Analyzing Competition
Abstract
In this paper, we introduce the Multi-Modal Video Reasoning and Analyzing Competition (MMVRAC) workshop, held in conjunction with ICCV 2021. The competition comprises four tracks: video question answering, skeleton-based action recognition, fisheye video-based action recognition, and person re-identification. The tracks are based on two datasets, SUTD-TrafficQA and UAV-Human. We summarize the top-performing methods submitted by participants and report the results they achieved.
Cite
Text
Peng et al. "The Multi-Modal Video Reasoning and Analyzing Competition." IEEE/CVF International Conference on Computer Vision Workshops, 2021. doi:10.1109/ICCVW54120.2021.00095
Markdown
[Peng et al. "The Multi-Modal Video Reasoning and Analyzing Competition." IEEE/CVF International Conference on Computer Vision Workshops, 2021.](https://mlanthology.org/iccvw/2021/peng2021iccvw-multimodal/) doi:10.1109/ICCVW54120.2021.00095
BibTeX
@inproceedings{peng2021iccvw-multimodal,
title = {{The Multi-Modal Video Reasoning and Analyzing Competition}},
author = {Peng, Haoran and Huang, He and Xu, Li and Li, Tianjiao and Liu, Jun and Rahmani, Hossein and Ke, Qiuhong and Guo, Zhicheng and Wu, Cong and Li, Rongchang and Ye, Mang and Wang, Jiahao and Zhang, Jiaxu and Liu, Yuanzhong and He, Tao and Zhang, Fuwei and Liu, Xianbin and Lin, Tao},
booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
year = {2021},
pages = {806--813},
doi = {10.1109/ICCVW54120.2021.00095},
url = {https://mlanthology.org/iccvw/2021/peng2021iccvw-multimodal/}
}