PAND: Precise Action Recognition on Naturalistic Driving

Abstract

Temporal action localization for untrimmed videos is a difficult problem in computer vision. It is challenging to accurately infer the start and end of activity instances on small-scale datasets that cover multi-view information. In this paper, we propose an effective temporal activity localization and classification method that localizes the temporal boundaries and predicts the class labels of activities in naturalistic driving. Our approach includes (i) a distraction behavior recognition and localization method for naturalistic driving videos on small-scale datasets, (ii) a strategy that uses a multi-branch network to make full use of information from different channels, and (iii) a post-processing method for selecting and correcting temporal ranges to ensure that our system finds accurate boundaries. In addition, frame-level object detection information is utilized. Extensive experiments demonstrate the effectiveness of our method, and we rank 6th on Test-A2 of Track 3 of the 6th AI City Challenge.

Cite

Text

Zhao et al. "PAND: Precise Action Recognition on Naturalistic Driving." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022. doi:10.1109/CVPRW56347.2022.00372

Markdown

[Zhao et al. "PAND: Precise Action Recognition on Naturalistic Driving." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022.](https://mlanthology.org/cvprw/2022/zhao2022cvprw-pand/) doi:10.1109/CVPRW56347.2022.00372

BibTeX

@inproceedings{zhao2022cvprw-pand,
  title     = {{PAND: Precise Action Recognition on Naturalistic Driving}},
  author    = {Zhao, Hangyue and Xiao, Yuchao and Zhao, Yanyun},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2022},
  pages     = {3290-3298},
  doi       = {10.1109/CVPRW56347.2022.00372},
  url       = {https://mlanthology.org/cvprw/2022/zhao2022cvprw-pand/}
}