Multi-Source Multi-Modal Activity Recognition in Aerial Video Surveillance

Abstract

Recognizing activities in wide aerial/overhead imagery remains a challenging problem due in part to low-resolution video and cluttered scenes with a large number of moving objects. In the context of this research, we deal with two unsynchronized data sources collected in real-world operating scenarios: full-motion videos (FMV) and analyst call-outs (ACO) in the form of chat messages (voice-to-text) made by a human watching the streamed FMV from an aerial platform. We present a multi-source multi-modal activity/event recognition system for surveillance applications, consisting of: (1) detecting and tracking multiple dynamic targets from a moving platform, (2) representing FMV target tracks and chat messages as graphs of attributes, (3) associating FMV tracks and chat messages using a probabilistic graph-based matching approach, and (4) detecting spatial-temporal activity boundaries. We also present an activity pattern learning framework which uses the multi-source associated data as training data to index a large archive of FMV. Finally, we describe a multi-intelligence user interface for querying an index of activities of interest (AOIs) by movement type and geo-location, and for playing back a summary of associated text (ACO) and activity video segments of targets-of-interest (TOIs) (in both pixel and geo-coordinates). Such tools help the end user quickly search, browse, and prepare mission reports from multi-source data.
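
To make step (3) more concrete, the following is a minimal sketch of associating tracks with chat messages. It is not the authors' method: it swaps the probabilistic graph-based matching for a much simpler score, the product of independent per-attribute match probabilities and a Gaussian weight on the time gap between a message and a track. The attribute names (`color`, `kind`, `action`), the probabilities `p_match` and `p_mismatch`, the tolerance `sigma`, and all sample data are assumptions made for illustration only.

```python
import math

# Hypothetical attribute keys; the paper does not list its attribute set here.
ATTRIBUTES = ("color", "kind", "action")


def attribute_score(track, message, p_match=0.9, p_mismatch=0.1):
    """Product of per-attribute probabilities over attributes both sides carry."""
    score = 1.0
    for key in ATTRIBUTES:
        if key in track and key in message:
            score *= p_match if track[key] == message[key] else p_mismatch
    return score


def temporal_weight(track, message, sigma=10.0):
    """Gaussian weight on the gap (seconds) between message time and track interval.

    The sources are unsynchronized, so a message may lag the activity it
    describes; sigma is an assumed tolerance, not a value from the paper.
    """
    start, end = track["interval"]
    t = message["time"]
    gap = 0.0 if start <= t <= end else min(abs(t - start), abs(t - end))
    return math.exp(-0.5 * (gap / sigma) ** 2)


def associate(tracks, messages):
    """Map each message id to the id of its best-scoring track."""
    result = {}
    for msg in messages:
        best = max(
            tracks,
            key=lambda trk: attribute_score(trk, msg) * temporal_weight(trk, msg),
        )
        result[msg["id"]] = best["id"]
    return result


# Made-up sample data for illustration.
tracks = [
    {"id": "T1", "color": "white", "kind": "vehicle", "action": "turn",
     "interval": (0.0, 30.0)},
    {"id": "T2", "color": "red", "kind": "vehicle", "action": "stop",
     "interval": (20.0, 60.0)},
]
messages = [
    {"id": "M1", "color": "red", "kind": "vehicle", "action": "stop", "time": 65.0},
    {"id": "M2", "color": "white", "kind": "vehicle", "action": "turn", "time": 12.0},
]

print(associate(tracks, messages))
```

In this sketch each message is matched independently, so two messages may map to the same track; a one-to-one assignment or a matching over full attribute graphs, as the abstract describes, would need a joint optimization instead.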

Cite

Text

Hammoud et al. "Multi-Source Multi-Modal Activity Recognition in Aerial Video Surveillance." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2014. doi:10.1109/CVPRW.2014.44

Markdown

[Hammoud et al. "Multi-Source Multi-Modal Activity Recognition in Aerial Video Surveillance." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2014.](https://mlanthology.org/cvprw/2014/hammoud2014cvprw-multisource/) doi:10.1109/CVPRW.2014.44

BibTeX

@inproceedings{hammoud2014cvprw-multisource,
  title     = {{Multi-Source Multi-Modal Activity Recognition in Aerial Video Surveillance}},
  author    = {Hammoud, Riad I. and Sahin, Cem S. and Blasch, Erik Philip and Rhodes, Bradley J.},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2014},
  pages     = {237--244},
  doi       = {10.1109/CVPRW.2014.44},
  url       = {https://mlanthology.org/cvprw/2014/hammoud2014cvprw-multisource/}
}