FineSports: A Multi-Person Hierarchical Sports Video Dataset for Fine-Grained Action Understanding

Abstract

Fine-grained action analysis in multi-person sports is complex due to athletes' quick movements and intense physical confrontations which result in severe visual obstructions in most scenes. In addition accessible multi-person sports video datasets lack fine-grained action annotations in both space and time adding to the difficulty in fine-grained action analysis. To this end we construct a new multi-person basketball sports video dataset named FineSports which contains fine-grained semantic and spatial-temporal annotations on 10000 NBA game videos covering 52 fine-grained action types 16000 action instances and 123000 spatial-temporal bounding boxes. We also propose a new prompt-driven spatial-temporal action location approach called PoSTAL composed of a prompt-driven target action encoder (PTA) and an action tube-specific detector (ATD) to directly generate target action tubes with fine-grained action types without any off-line proposal generation. Extensive experiments on the FineSports dataset demonstrate that PoSTAL outperforms state-of-the-art methods. Data and code are available at https://github.com/PKU-ICST-MIPL/FineSports_CVPR2024.

Cite

Text

Xu et al. "FineSports: A Multi-Person Hierarchical Sports Video Dataset for Fine-Grained Action Understanding." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.02057

Markdown

[Xu et al. "FineSports: A Multi-Person Hierarchical Sports Video Dataset for Fine-Grained Action Understanding." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/xu2024cvpr-finesports/) doi:10.1109/CVPR52733.2024.02057

BibTeX

@inproceedings{xu2024cvpr-finesports,
  title     = {{FineSports: A Multi-Person Hierarchical Sports Video Dataset for Fine-Grained Action Understanding}},
  author    = {Xu, Jinglin and Zhao, Guohao and Yin, Sibo and Zhou, Wenhao and Peng, Yuxin},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {21773-21782},
  doi       = {10.1109/CVPR52733.2024.02057},
  url       = {https://mlanthology.org/cvpr/2024/xu2024cvpr-finesports/}
}