Complex Events Detection Using Data-Driven Concepts

Abstract

Automatic event detection in large collections of unconstrained videos is a challenging and important task. The key issue is to describe long, complex videos with high-level semantic descriptors, which should capture the regularity of events within the same category while distinguishing events from different categories. This paper proposes a novel unsupervised approach that discovers data-driven concepts from multi-modality signals (audio, scene, and motion) to describe the high-level semantics of videos. Our method consists of three main components: first, we learn low-level features separately for the three modalities; second, we discover data-driven concepts based on the statistics of the learned features mapped to a low-dimensional space using deep belief nets (DBNs); finally, a compact and robust sparse representation is learned to jointly model the concepts from all three modalities. Extensive experimental results on a large in-the-wild dataset show that our proposed method significantly outperforms state-of-the-art methods.
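The abstract's three-stage pipeline (per-modality features, low-dimensional concept discovery, joint sparse representation) can be sketched roughly as below. This is only an illustrative outline, not the paper's implementation: truncated SVD stands in for the DBN mapping, the dictionary `D` is random rather than learned, and the ISTA sparse-coding loop, the toy feature matrices, and all dimensions are assumptions for the sake of a runnable example.

```python
import numpy as np

rng = np.random.default_rng(0)

def reduce_dim(X, k):
    # Stand-in for the paper's DBN step: map features to a
    # low-dimensional space (here via truncated SVD, an assumption).
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def sparse_code(D, x, lam=1.0, n_iter=200):
    # ISTA for the lasso 0.5*||x - D a||^2 + lam*||a||_1 -- a generic
    # sketch of a joint sparse representation over all modalities.
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)           # gradient of the quadratic term
        a = a - grad / L
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)  # soft-threshold
    return a

# Toy "videos": one low-level feature matrix per modality
# (audio, scene, motion) -- random placeholders, 60 clips x 40 features.
modalities = [rng.normal(size=(60, 40)) for _ in range(3)]

# Stage 2: map each modality to a 5-D concept space and concatenate.
concepts = np.hstack([reduce_dim(X, 5) for X in modalities])   # (60, 15)

# Stage 3: code each video over a dictionary of concepts
# (random here; learned from data in the paper).
D = rng.normal(size=(15, 30))
D /= np.linalg.norm(D, axis=0)             # unit-norm atoms

code = sparse_code(D, concepts[0])
print(code.shape)
```

The resulting `code` vector is the kind of compact, sparse per-video descriptor the abstract describes; a classifier would then be trained on such codes to detect event categories.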

Cite

Text

Yang and Shah. "Complex Events Detection Using Data-Driven Concepts." European Conference on Computer Vision, 2012. doi:10.1007/978-3-642-33712-3_52

Markdown

[Yang and Shah. "Complex Events Detection Using Data-Driven Concepts." European Conference on Computer Vision, 2012.](https://mlanthology.org/eccv/2012/yang2012eccv-complex/) doi:10.1007/978-3-642-33712-3_52

BibTeX

@inproceedings{yang2012eccv-complex,
  title     = {{Complex Events Detection Using Data-Driven Concepts}},
  author    = {Yang, Yang and Shah, Mubarak},
  booktitle = {European Conference on Computer Vision},
  year      = {2012},
  pages     = {722--735},
  doi       = {10.1007/978-3-642-33712-3_52},
  url       = {https://mlanthology.org/eccv/2012/yang2012eccv-complex/}
}