Probabilistic Semantic Video Indexing
Abstract
We propose a novel probabilistic framework for semantic video indexing. We define probabilistic multimedia objects (multijects) to map low-level media features to high-level semantic labels. A graphical network of such multijects (multinet) captures scene context by discovering intra-frame as well as inter-frame dependency relations between the concepts. The main contribution is a novel application of a factor graph framework to model this network. We model relations between semantic concepts in terms of their co-occurrence as well as the temporal dependencies between these concepts within video shots. Using the sum-product algorithm [1] for approximate or exact inference in these factor graph multinets, we attempt to correct errors made during isolated concept detection by forcing high-level constraints. This results in a significant improvement in the overall detection performance.
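As a rough illustration of how sum-product inference can revise an isolated detector's output using co-occurrence context, the following is a minimal sketch on a two-variable factor graph. It is not the authors' implementation: the concept names (`sky`, `outdoor`), the detector probabilities, and the co-occurrence table are all hypothetical values chosen for the example, and the function name `sum_product_pair` is an assumption. On a graph this small (a tree), one pass of messages gives exact marginals.

```python
# Toy sum-product pass on a two-variable factor graph (illustrative values only).
# States are indexed 0 = concept absent, 1 = concept present.

def normalize(values):
    """Scale a list of non-negative numbers so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]

def sum_product_pair(local_a, local_b, pairwise):
    """Return posterior beliefs for two binary variables joined by one factor.

    local_a, local_b: unary factors (hypothetical isolated detector outputs).
    pairwise[i][j]: compatibility of a = i with b = j (co-occurrence factor).
    """
    # Factor-to-variable messages: multiply by the other variable's
    # unary factor and sum that variable out.
    msg_to_a = [sum(pairwise[i][j] * local_b[j] for j in range(2)) for i in range(2)]
    msg_to_b = [sum(pairwise[i][j] * local_a[i] for i in range(2)) for j in range(2)]
    # Belief = unary factor times incoming message, then normalize.
    belief_a = normalize([local_a[i] * msg_to_a[i] for i in range(2)])
    belief_b = normalize([local_b[j] * msg_to_b[j] for j in range(2)])
    return belief_a, belief_b

# Hypothetical detector outputs: a weak "sky" detector and a confident "outdoor" one.
sky_local = [0.6, 0.4]
outdoor_local = [0.1, 0.9]
# Hypothetical co-occurrence factor: rows index sky, columns index outdoor.
cooccurrence = [[0.4, 0.1],
                [0.05, 0.45]]

sky_belief, outdoor_belief = sum_product_pair(sky_local, outdoor_local, cooccurrence)
print("P(sky present):", round(sky_belief[1], 3))
print("P(outdoor present):", round(outdoor_belief[1], 3))
```

With these made-up numbers, the belief that `sky` is present ends up higher than the isolated detector's 0.4, because the confident `outdoor` detection and the co-occurrence factor both support it. The temporal dependencies mentioned in the abstract would add further factors linking the same concept across frames; they are omitted here.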
Cite
Text
Naphade et al. "Probabilistic Semantic Video Indexing." Neural Information Processing Systems, 2000.
Markdown
[Naphade et al. "Probabilistic Semantic Video Indexing." Neural Information Processing Systems, 2000.](https://mlanthology.org/neurips/2000/naphade2000neurips-probabilistic/)
BibTeX
@inproceedings{naphade2000neurips-probabilistic,
title = {{Probabilistic Semantic Video Indexing}},
author = {Naphade, Milind R. and Kozintsev, Igor and Huang, Thomas S.},
booktitle = {Neural Information Processing Systems},
year = {2000},
pages = {967-973},
url = {https://mlanthology.org/neurips/2000/naphade2000neurips-probabilistic/}
}