Leveraging Temporal, Contextual and Ordering Constraints for Recognizing Complex Activities in Video
Abstract
We present a scalable approach to recognizing and describing complex activities in video sequences. We are interested in long-term, sequential activities that may have several parallel streams of action. Our approach integrates temporal, contextual and ordering constraints with output from low-level visual detectors to recognize complex, long-term activities. We argue that a hierarchical, object-oriented design makes our solution scalable in that higher-level reasoning components are independent of the particular low-level detector implementation and that recognition of additional activities and actions can easily be added. Three major components realize this design: a dynamic Bayesian network structure for representing activities composed of partially ordered sub-actions, an object-oriented action hierarchy for building arbitrarily complex action detectors, and an approximate Viterbi-like algorithm for inferring the most likely observed sequence of actions. Additionally, we propose the Erlang distribution as a comprehensive model of idle time between actions and frequency of observing new actions. We show results for our approach on real video sequences containing complex activities.
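To make the idle-time model concrete, the following is a minimal sketch of the standard Erlang density, f(t; k, λ) = λ^k t^(k-1) e^(-λt) / (k-1)!, evaluated in Python. The function name `erlang_pdf` and the parameter names `k` and `rate` are assumptions chosen for illustration; the abstract does not say how the authors parameterize or fit the distribution.

```python
import math


def erlang_pdf(t: float, k: int, rate: float) -> float:
    """Standard Erlang density with integer shape k >= 1 and rate > 0.

    Hypothetical helper for illustration only; not the authors' code.
    """
    if k < 1 or rate <= 0:
        raise ValueError("k must be >= 1 and rate must be > 0")
    if t < 0:
        # The Erlang distribution has no mass on negative durations.
        return 0.0
    return (rate ** k) * (t ** (k - 1)) * math.exp(-rate * t) / math.factorial(k - 1)


# Example: density of a 2-second idle gap under an assumed shape of 3
# and an assumed rate of 1.5 events per second.
density = erlang_pdf(2.0, k=3, rate=1.5)
```

With `k = 1` the density reduces to the exponential distribution, and larger `k` concentrates mass around the mean `k / rate`, which is one reason an Erlang model can describe waiting times that are rarely near zero.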
Cite
Text
Laxton et al. "Leveraging Temporal, Contextual and Ordering Constraints for Recognizing Complex Activities in Video." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2007. doi:10.1109/CVPR.2007.383074
Markdown
[Laxton et al. "Leveraging Temporal, Contextual and Ordering Constraints for Recognizing Complex Activities in Video." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2007.](https://mlanthology.org/cvpr/2007/laxton2007cvpr-leveraging/) doi:10.1109/CVPR.2007.383074
BibTeX
@inproceedings{laxton2007cvpr-leveraging,
title = {{Leveraging Temporal, Contextual and Ordering Constraints for Recognizing Complex Activities in Video}},
author = {Laxton, Benjamin and Lim, Jongwoo and Kriegman, David J.},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {2007},
doi = {10.1109/CVPR.2007.383074},
url = {https://mlanthology.org/cvpr/2007/laxton2007cvpr-leveraging/}
}