Deep Cross-Modal Hashing

Abstract

Due to its low storage cost and fast query speed, cross-modal hashing (CMH) has been widely used for similarity search in multimedia retrieval applications. However, most existing CMH methods are based on hand-crafted features which might not be optimally compatible with the hash-code learning procedure. As a result, existing CMH methods with hand-crafted features may not achieve satisfactory performance. In this paper, we propose a novel CMH method, called deep cross-modal hashing (DCMH), by integrating feature learning and hash-code learning into the same framework. DCMH is an end-to-end learning framework with deep neural networks, one for each modality, to perform feature learning from scratch. Experiments on three real datasets with image-text modalities show that DCMH outperforms other baselines and achieves state-of-the-art performance in cross-modal retrieval applications.
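To make the end-to-end, two-branch idea concrete, below is a minimal sketch (not the authors' released implementation) of a DCMH-style architecture: a CNN branch for images and an MLP branch for bag-of-words text, each ending in a tanh layer that produces a K-bit real-valued code, trained with a pairwise similarity loss and quantized to ±1 at retrieval time. The layer sizes, the vocabulary dimension, the toy similarity matrix, and the function names are illustrative assumptions, written here in PyTorch.

```python
# Hypothetical sketch of a DCMH-style two-branch hashing network.
# All sizes (vocab_size, code_length, conv widths) are illustrative assumptions.
import torch
import torch.nn as nn

class ImageBranch(nn.Module):
    """CNN mapping raw images to a length-K real-valued code in (-1, 1)."""
    def __init__(self, code_length=16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.hash_layer = nn.Sequential(
            nn.Flatten(), nn.Linear(64 * 4 * 4, 256), nn.ReLU(),
            nn.Linear(256, code_length), nn.Tanh(),
        )

    def forward(self, x):
        return self.hash_layer(self.features(x))

class TextBranch(nn.Module):
    """MLP mapping bag-of-words text vectors to a length-K code."""
    def __init__(self, vocab_size=1386, code_length=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(vocab_size, 512), nn.ReLU(),
            nn.Linear(512, code_length), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

def pairwise_similarity_loss(f_img, g_txt, S):
    """Negative log-likelihood of a binary cross-modal similarity matrix S
    (S_ij = 1 if image i and text j are similar), using inner products of
    the two branches' codes as logits -- a common pairwise CMH objective."""
    theta = 0.5 * f_img @ g_txt.t()                      # (n_img, n_txt) logits
    return -(S * theta - torch.log1p(torch.exp(theta))).mean()

if __name__ == "__main__":
    img_net, txt_net = ImageBranch(), TextBranch()
    images = torch.randn(8, 3, 32, 32)                   # toy image batch
    texts = torch.rand(8, 1386)                          # toy bag-of-words batch
    S = torch.eye(8)                                      # toy labels: similar iff same index
    loss = pairwise_similarity_loss(img_net(images), txt_net(texts), S)
    loss.backward()                                       # both branches are trained jointly
    binary_codes = torch.sign(img_net(images))            # quantize to ±1 for retrieval
    print(loss.item(), binary_codes.shape)
```

At query time, only the sign of each branch's output is kept, so retrieval reduces to Hamming-distance search over compact binary codes, which is what gives CMH its low storage cost and fast query speed.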

Cite

Text

Jiang and Li. "Deep Cross-Modal Hashing." Conference on Computer Vision and Pattern Recognition, 2017. doi:10.1109/CVPR.2017.348

Markdown

[Jiang and Li. "Deep Cross-Modal Hashing." Conference on Computer Vision and Pattern Recognition, 2017.](https://mlanthology.org/cvpr/2017/jiang2017cvpr-deep/) doi:10.1109/CVPR.2017.348

BibTeX

@inproceedings{jiang2017cvpr-deep,
  title     = {{Deep Cross-Modal Hashing}},
  author    = {Jiang, Qing-Yuan and Li, Wu-Jun},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2017},
  doi       = {10.1109/CVPR.2017.348},
  url       = {https://mlanthology.org/cvpr/2017/jiang2017cvpr-deep/}
}