Spectral Hashing

Abstract

Semantic hashing seeks compact binary codes of datapoints so that the Hamming distance between codewords correlates with semantic similarity. Hinton et al. used a clever implementation of autoencoders to find such codes. In this paper, we show that the problem of finding a best code for a given dataset is closely related to the problem of graph partitioning and can be shown to be NP hard. By relaxing the original problem, we obtain a spectral method whose solutions are simply a subset of thresh- olded eigenvectors of the graph Laplacian. By utilizing recent results on convergence of graph Laplacian eigenvectors to the Laplace-Beltrami eigen- functions of manifolds, we show how to efficiently calculate the code of a novel datapoint. Taken together, both learning the code and applying it to a novel point are extremely simple. Our experiments show that our codes significantly outperform the state-of-the art.

Cite

Text

Weiss et al. "Spectral Hashing." Neural Information Processing Systems, 2008.

Markdown

[Weiss et al. "Spectral Hashing." Neural Information Processing Systems, 2008.](https://mlanthology.org/neurips/2008/weiss2008neurips-spectral/)

BibTeX

@inproceedings{weiss2008neurips-spectral,
  title     = {{Spectral Hashing}},
  author    = {Weiss, Yair and Torralba, Antonio and Fergus, Rob},
  booktitle = {Neural Information Processing Systems},
  year      = {2008},
  pages     = {1753-1760},
  url       = {https://mlanthology.org/neurips/2008/weiss2008neurips-spectral/}
}