Deep Cross-Modal Hashing
Abstract
Due to its low storage cost and fast query speed, cross-modal hashing (CMH) has been widely used for similarity search in multimedia retrieval applications. However, most existing CMH methods are based on hand-crafted features which might not be optimally compatible with the hash-code learning procedure. As a result, existing CMH methods with hand-crafted features may not achieve satisfactory performance. In this paper, we propose a novel CMH method, called deep cross-modal hashing (DCMH), by integrating feature learning and hash-code learning into the same framework. DCMH is an end-to-end learning framework with deep neural networks, one for each modality, to perform feature learning from scratch. Experiments on three real datasets with image-text modalities show that DCMH outperforms other baselines and achieves state-of-the-art performance in cross-modal retrieval applications.
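The abstract describes two deep networks, one per modality, trained end-to-end so that feature learning and hash-code learning share one objective. The following is a minimal PyTorch-style sketch of that idea; the layer sizes, the simplified image encoder, the vocabulary size, and the loss weight are illustrative assumptions, not the authors' exact architecture or hyper-parameters.

```python
# Sketch of a two-network cross-modal hashing setup (assumed details, not
# the exact DCMH configuration from the paper).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ImageNet(nn.Module):
    """Small CNN mapping raw images to code_len real-valued hash outputs."""
    def __init__(self, code_len=16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.hash = nn.Linear(64, code_len)

    def forward(self, x):
        return self.hash(self.features(x).flatten(1))

class TextNet(nn.Module):
    """MLP over bag-of-words text vectors, ending in the same code length."""
    def __init__(self, vocab_size=2000, code_len=16):  # vocab_size is assumed
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(vocab_size, 512), nn.ReLU(),
            nn.Linear(512, code_len),
        )

    def forward(self, x):
        return self.net(x)

def cross_modal_loss(f_img, g_txt, S, gamma=1.0):
    """Negative log-likelihood of the pairwise similarity matrix S (0/1),
    plus a quantization penalty pulling both outputs toward shared binary
    codes; gamma is an assumed weighting."""
    theta = 0.5 * f_img @ g_txt.t()
    nll = -(S * theta - F.softplus(theta)).mean()       # softplus = log(1+exp)
    b = torch.sign(f_img.detach() + g_txt.detach())     # shared binary codes
    quant = ((b - f_img) ** 2).mean() + ((b - g_txt) ** 2).mean()
    return nll + gamma * quant
```

At retrieval time, both networks would be applied to new items and their outputs binarized with the sign function, so that cross-modal search reduces to Hamming-distance ranking; this usage note follows the general CMH setup rather than any detail stated in the abstract.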
Cite
Text
Jiang and Li. "Deep Cross-Modal Hashing." Conference on Computer Vision and Pattern Recognition, 2017. doi:10.1109/CVPR.2017.348
Markdown
[Jiang and Li. "Deep Cross-Modal Hashing." Conference on Computer Vision and Pattern Recognition, 2017.](https://mlanthology.org/cvpr/2017/jiang2017cvpr-deep/) doi:10.1109/CVPR.2017.348
BibTeX
@inproceedings{jiang2017cvpr-deep,
title = {{Deep Cross-Modal Hashing}},
author = {Jiang, Qing-Yuan and Li, Wu-Jun},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2017},
doi = {10.1109/CVPR.2017.348},
url = {https://mlanthology.org/cvpr/2017/jiang2017cvpr-deep/}
}