SAVE: A Framework for Semantic Annotation of Visual Events
Abstract
In this paper we propose a framework that performs automatic semantic annotation of visual events (SAVE). This is an enabling technology for content-based video annotation, query, and retrieval, with applications in Internet video search and video data mining. The method involves identifying objects in the scene, describing their inter-relations, detecting events of interest, and representing them semantically in a human-readable and queryable format. The SAVE framework is composed of three main components. The first component is an image parsing engine that performs scene content extraction using bottom-up image analysis and a stochastic attribute image grammar, in which we define a visual vocabulary from pixels, primitives, parts, objects, and scenes, and specify their spatio-temporal or compositional relations; a combined bottom-up and top-down strategy is used for inference. The second component is an event inference engine, in which the Video Event Markup Language (VEML) is adopted for semantic representation and a grammar-based approach is used for event analysis and detection. The third component is a text generation engine that generates text reports using head-driven phrase structure grammar (HPSG). The main contribution of this paper is a framework for an end-to-end system that infers visual events and annotates a large collection of videos. Experiments with maritime and urban scenes indicate the feasibility of the proposed approach.
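The three-component flow described in the abstract can be illustrated with a toy pipeline. This is only a minimal sketch under strong assumptions, not the authors' implementation: every name here (`Detection`, `Event`, `parse_scene`, `infer_events`, `generate_text`, `annotate`) is hypothetical, a simple distance threshold stands in for the grammar-based event inference, a string template stands in for HPSG-based generation, and no VEML is produced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    frame: int
    label: str
    x: float
    y: float


@dataclass(frozen=True)
class Event:
    name: str
    agent: str
    target: str
    frame: int


def parse_scene(raw_frames):
    """Stage 1 stand-in: flatten per-frame (label, x, y) tuples into detections.

    A real image parsing engine would extract these from pixels.
    """
    return [
        Detection(frame, label, x, y)
        for frame, objects in enumerate(raw_frames)
        for (label, x, y) in objects
    ]


def infer_events(detections, radius=1.0):
    """Stage 2 stand-in: report the first frame in which two objects are within `radius`.

    This proximity rule is an assumption that replaces grammar-based inference.
    """
    by_frame = {}
    for det in detections:
        by_frame.setdefault(det.frame, []).append(det)

    events, seen = [], set()
    for frame in sorted(by_frame):
        objects = by_frame[frame]
        for i, a in enumerate(objects):
            for b in objects[i + 1:]:
                distance = ((a.x - b.x) ** 2 + (a.y - b.y) ** 2) ** 0.5
                pair = (a.label, b.label)
                if distance <= radius and pair not in seen:
                    seen.add(pair)
                    events.append(Event("approaches", a.label, b.label, frame))
    return events


def generate_text(events):
    """Stage 3 stand-in: template-based sentences instead of HPSG generation."""
    return [
        f"At frame {e.frame}, the {e.agent} {e.name} the {e.target}."
        for e in events
    ]


def annotate(raw_frames):
    """Chain the three stages: parse, infer events, generate text."""
    return generate_text(infer_events(parse_scene(raw_frames)))
```

For example, calling `annotate` on two frames in which a boat moves from 5 units to 0.5 units away from a dock yields a single sentence describing the approach at the second frame. The point of the sketch is the separation of concerns: each stage consumes only the previous stage's structured output, so any one stand-in could be swapped for a richer model without touching the others.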
Cite
Text
Lee et al. "SAVE: A Framework for Semantic Annotation of Visual Events." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2008. doi:10.1109/CVPRW.2008.4562954
Markdown
[Lee et al. "SAVE: A Framework for Semantic Annotation of Visual Events." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2008.](https://mlanthology.org/cvprw/2008/lee2008cvprw-save/) doi:10.1109/CVPRW.2008.4562954
BibTeX
@inproceedings{lee2008cvprw-save,
title = {{SAVE: A Framework for Semantic Annotation of Visual Events}},
author = {Lee, Mun Wai and Hakeem, Asaad and Haering, Niels and Zhu, Song-Chun},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2008},
pages = {1-8},
doi = {10.1109/CVPRW.2008.4562954},
url = {https://mlanthology.org/cvprw/2008/lee2008cvprw-save/}
}