Tag-Based Video Retrieval by Embedding Semantic Content in a Continuous Word Space

Abstract

Content-based event retrieval in unconstrained web videos, based on query tags, is a hard problem due to large intra-class variance and the limited vocabulary and accuracy of video concept detectors, which create a "semantic query gap". We present a technique to overcome this gap by using continuous word space representations to explicitly compute the similarity between query tags and detector concepts. This not only allows fast query-video similarity computation with implicit query expansion, but also yields a compact video representation, enabling a real-time retrieval system that fits several thousand videos in a few hundred megabytes of memory. We evaluate the effectiveness of our representation on the challenging NIST MEDTest 2014 dataset.
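The core idea of scoring detector concepts against query tags in a continuous word space can be illustrated with a minimal sketch. This is not the paper's implementation: the toy 3-dimensional embeddings and all names below are hypothetical (a real system would use pretrained vectors such as word2vec), but the cosine-similarity matching shows how a query like "puppy" implicitly expands to a semantically close detector concept like "dog".

```python
import math

# Hypothetical toy embeddings; real systems would use pretrained
# continuous word vectors (e.g. word2vec) of much higher dimension.
embeddings = {
    "dog":   [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "car":   [0.0, 0.9, 0.4],
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def query_detector_similarity(query_tags, detector_concepts):
    """Score each detector concept by its best embedding-space match
    to any query tag -- an implicit query expansion, since related
    words (e.g. "puppy"/"dog") lie close together in the space."""
    return {
        concept: max(cosine(embeddings[concept], embeddings[tag])
                     for tag in query_tags)
        for concept in detector_concepts
    }

scores = query_detector_similarity(["puppy"], ["dog", "car"])
# The semantically related concept "dog" outscores "car".
```

Because a video's detector responses reduce to a short vector of concept scores weighted this way, the representation stays compact, which is what makes the memory footprint cited in the abstract plausible.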

Cite

Text

Agharwal et al. "Tag-Based Video Retrieval by Embedding Semantic Content in a Continuous Word Space." IEEE/CVF Winter Conference on Applications of Computer Vision, 2016. doi:10.1109/WACV.2016.7477706

Markdown

[Agharwal et al. "Tag-Based Video Retrieval by Embedding Semantic Content in a Continuous Word Space." IEEE/CVF Winter Conference on Applications of Computer Vision, 2016.](https://mlanthology.org/wacv/2016/agharwal2016wacv-tag/) doi:10.1109/WACV.2016.7477706

BibTeX

@inproceedings{agharwal2016wacv-tag,
  title     = {{Tag-Based Video Retrieval by Embedding Semantic Content in a Continuous Word Space}},
  author    = {Agharwal, Arnav and Kovvuri, Rama and Nevatia, Ram and Snoek, Cees G. M.},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      = {2016},
  pages     = {1--8},
  doi       = {10.1109/WACV.2016.7477706},
  url       = {https://mlanthology.org/wacv/2016/agharwal2016wacv-tag/}
}