Pyramid Coding for Functional Scene Element Recognition in Video Scenes
Abstract
Recognizing functional scene elements in video scenes based on the behaviors of moving objects that interact with them is an emerging problem of interest. Existing approaches have a limited ability to characterize elements such as crosswalks, intersections, and buildings that have low activity, are multi-modal, or have indirect evidence. Our approach recognizes the low-activity and multi-modal elements (crosswalks/intersections) by introducing a hierarchy of descriptive clusters to form a pyramid of codebooks that is sparse in the number of clusters and dense in content. The incorporation of local behavioral context such as person-enter-building and vehicle-parking nearby enables the detection of elements that do not have direct motion-based evidence, e.g. buildings. These two contributions significantly improve scene element recognition when compared against three state-of-the-art approaches. Results are shown on typical ground-level surveillance video and, for the first time, on the more complex Wide Area Motion Imagery.
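The abstract does not spell out how the pyramid of codebooks is built, so the following is only a rough sketch of the general idea, not the authors' algorithm. It assumes recursive k-means as the clustering step and per-level histograms as the encoding; the function names (`kmeans`, `nearest`, `build_pyramid`, `encode`) and the parameters `branch` and `levels` are hypothetical and chosen for illustration.

```python
import random

def nearest(p, centroids):
    """Index of the centroid closest to p (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on lists of floats; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster goes empty
                centroids[i] = [sum(c) / len(g) for c in zip(*g)]
    return centroids

def build_pyramid(points, branch=2, levels=2, seed=0):
    """Return one codebook (list of centroids) per level, coarse to fine.

    Assumption: each level re-clusters the members of every cluster
    found at the previous level (a hierarchical k-means).
    """
    pyramid = []
    groups = [points]
    for level in range(levels):
        codebook, next_groups = [], []
        for g in groups:
            k = min(branch, len(g))
            cents = kmeans(g, k, seed=seed + level)
            buckets = [[] for _ in range(k)]
            for p in g:
                buckets[nearest(p, cents)].append(p)
            for c, b in zip(cents, buckets):
                if b:  # drop codewords that attracted no points
                    codebook.append(c)
                    next_groups.append(b)
        pyramid.append(codebook)
        groups = next_groups
    return pyramid

def encode(descriptors, pyramid):
    """Concatenate per-level normalized histograms of codeword assignments.

    Each descriptor is matched against every codeword of a level,
    not routed down the tree; this is a simplification.
    """
    code = []
    for codebook in pyramid:
        hist = [0.0] * len(codebook)
        for d in descriptors:
            hist[nearest(d, codebook)] += 1.0
        total = sum(hist) or 1.0
        code.extend(h / total for h in hist)
    return code
```

In this sketch, a scene region would be described by calling `encode` on the behavior descriptors observed in it; the coarse levels have few clusters that each summarize many observations, which is one way to read "sparse in the number of clusters and dense in content".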
Cite
Text
Swears et al. "Pyramid Coding for Functional Scene Element Recognition in Video Scenes." International Conference on Computer Vision, 2013. doi:10.1109/ICCV.2013.50
Markdown
[Swears et al. "Pyramid Coding for Functional Scene Element Recognition in Video Scenes." International Conference on Computer Vision, 2013.](https://mlanthology.org/iccv/2013/swears2013iccv-pyramid/) doi:10.1109/ICCV.2013.50
BibTeX
@inproceedings{swears2013iccv-pyramid,
title = {{Pyramid Coding for Functional Scene Element Recognition in Video Scenes}},
author = {Swears, Eran and Hoogs, Anthony and Boyer, Kim},
booktitle = {International Conference on Computer Vision},
year = {2013},
doi = {10.1109/ICCV.2013.50},
url = {https://mlanthology.org/iccv/2013/swears2013iccv-pyramid/}
}