Tag-Based Video Retrieval by Embedding Semantic Content in a Continuous Word Space
Abstract
Content-based event retrieval in unconstrained web videos, driven by query tags, is a hard problem: large intra-class variance and the limited vocabulary and accuracy of video concept detectors create a "semantic query gap". We present a technique to overcome this gap by using continuous word space representations to explicitly compute the similarity between query tags and detector concepts. This not only allows fast query-video similarity computation with implicit query expansion, but also yields a compact video representation, enabling a real-time retrieval system that fits several thousand videos in a few hundred megabytes of memory. We evaluate the effectiveness of our representation on the challenging NIST MEDTest 2014 dataset.
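The core idea of matching query tags to detector concepts in a continuous word space can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy 4-d embeddings, the concept names, and the max-pooled scoring rule are all hypothetical stand-ins (the paper relies on a pre-trained word embedding space over a real detector vocabulary).

```python
import numpy as np

# Hypothetical toy word embeddings; in practice these come from a
# pre-trained continuous word space (e.g. word2vec-style vectors).
EMBEDDINGS = {
    "dog":     np.array([0.9, 0.1, 0.0, 0.2]),
    "puppy":   np.array([0.8, 0.2, 0.1, 0.1]),
    "car":     np.array([0.0, 0.9, 0.8, 0.1]),
    "vehicle": np.array([0.1, 0.8, 0.9, 0.0]),
}

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def query_video_score(query_tags, detector_scores):
    """Score a video against a tag query.

    For each query tag, take the best-matching detector concept in
    embedding space (this is what gives implicit query expansion:
    'dog' matches a 'puppy' detector), weighted by that detector's
    confidence on this video, then average over the tags.
    """
    per_tag = [
        max(cosine(EMBEDDINGS[tag], EMBEDDINGS[concept]) * conf
            for concept, conf in detector_scores.items())
        for tag in query_tags
    ]
    return sum(per_tag) / len(per_tag)

# A video is represented compactly by its detector confidences.
video = {"puppy": 0.7, "vehicle": 0.1}
print(query_video_score(["dog"], video))  # high: 'dog' is near 'puppy'
print(query_video_score(["car"], video))  # low: no strong vehicle evidence
```

Because each video reduces to a short vector of detector confidences, query-time scoring is just embedding lookups and dot products, which is what makes the real-time, memory-compact retrieval claim plausible.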
Cite
Text
Agharwal et al. "Tag-Based Video Retrieval by Embedding Semantic Content in a Continuous Word Space." IEEE/CVF Winter Conference on Applications of Computer Vision, 2016. doi:10.1109/WACV.2016.7477706
Markdown
[Agharwal et al. "Tag-Based Video Retrieval by Embedding Semantic Content in a Continuous Word Space." IEEE/CVF Winter Conference on Applications of Computer Vision, 2016.](https://mlanthology.org/wacv/2016/agharwal2016wacv-tag/) doi:10.1109/WACV.2016.7477706
BibTeX
@inproceedings{agharwal2016wacv-tag,
title = {{Tag-Based Video Retrieval by Embedding Semantic Content in a Continuous Word Space}},
author = {Agharwal, Arnav and Kovvuri, Rama and Nevatia, Ram and Snoek, Cees G. M.},
booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
year = {2016},
  pages = {1--8},
doi = {10.1109/WACV.2016.7477706},
url = {https://mlanthology.org/wacv/2016/agharwal2016wacv-tag/}
}