Neighborhood Preserving Hashing for Scalable Video Retrieval
Abstract
In this paper, we propose a Neighborhood Preserving Hashing (NPH) method for scalable video retrieval in an unsupervised manner. Unlike most existing deep video hashing methods, which indiscriminately compress an entire video into a binary code, we embed spatial-temporal neighborhood information into the encoding network so that the neighborhood-relevant visual content of a video is preferentially encoded into the binary code. Specifically, we propose a neighborhood attention mechanism that, conditioned on the neighborhood information, focuses on the useful part of each input frame. We then integrate the neighborhood attention mechanism into an RNN-based reconstruction scheme to encourage the binary codes to capture the spatial-temporal structure of a video that is consistent with that of its neighborhood. As a consequence, the learned hashing functions map similar videos to similar binary codes. Extensive experiments on three widely used benchmark datasets validate the effectiveness of the proposed approach.
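The retrieval step that such binary codes enable can be illustrated with a minimal sketch. This is not the NPH encoder or the authors' code: it assumes codes are obtained by thresholding some real-valued feature vector and that retrieval ranks database items by Hamming distance to the query. The names `sign_hash`, `hamming_distance`, and `rank_by_hamming`, along with the example codes, are hypothetical.

```python
def sign_hash(features, threshold=0.0):
    """Binarize a real-valued vector into an integer bit code (assumed scheme)."""
    code = 0
    for value in features:
        # Shift in one bit per feature: 1 if above the threshold, else 0.
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming_distance(a, b):
    """Count the bit positions in which two integer codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code, database_codes):
    """Return database indices sorted by increasing Hamming distance to the query."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


# Hypothetical 8-bit codes for three database videos and one query video.
database = [0b01001101, 0b10110010, 0b10110011]
query = 0b10110110
ranking = rank_by_hamming(query, database)
```

In this toy example the query differs from the three database codes in 7, 1, and 2 bits respectively, so `ranking` is `[1, 2, 0]`. Similar videos receiving similar codes is what makes this ranking meaningful; the comparison itself costs only an XOR and a bit count per item.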
Cite
Text
Li et al. "Neighborhood Preserving Hashing for Scalable Video Retrieval." Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019. doi:10.1109/ICCV.2019.00830
Markdown
[Li et al. "Neighborhood Preserving Hashing for Scalable Video Retrieval." Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019.](https://mlanthology.org/iccv/2019/li2019iccv-neighborhood/) doi:10.1109/ICCV.2019.00830
BibTeX
@inproceedings{li2019iccv-neighborhood,
title = {{Neighborhood Preserving Hashing for Scalable Video Retrieval}},
author = {Li, Shuyan and Chen, Zhixiang and Lu, Jiwen and Li, Xiu and Zhou, Jie},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
year = {2019},
doi = {10.1109/ICCV.2019.00830},
url = {https://mlanthology.org/iccv/2019/li2019iccv-neighborhood/}
}