Norm-Ranging LSH for Maximum Inner Product Search

Abstract

Neyshabur and Srebro proposed SIMPLE-LSH, which is the state-of-the-art hashing-based algorithm for maximum inner product search (MIPS). We found that the performance of SIMPLE-LSH, in both theory and practice, suffers from long tails in the 2-norm distribution of real datasets. We propose NORM-RANGING LSH, which addresses the excessive normalization problem caused by long tails by partitioning a dataset into sub-datasets and building a hash index for each sub-dataset independently. We prove that NORM-RANGING LSH achieves lower query time complexity than SIMPLE-LSH under mild conditions. We also show that the idea of dataset partitioning can improve another hashing-based MIPS algorithm. Experiments show that NORM-RANGING LSH probes far fewer items than SIMPLE-LSH at the same recall, thus significantly benefiting MIPS-based applications.
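To make the partitioning idea concrete, below is a minimal NumPy sketch (not the authors' implementation) of norm-ranging indexing on top of the SIMPLE-LSH transform: items are split into sub-datasets by 2-norm, each split is normalized by its own local maximum norm, and a separate sign-random-projection index is built per split. The split count, bit count, and the simple candidate merge at query time are illustrative choices; the paper's query-time ranking across sub-indexes is more refined.

```python
import numpy as np


def simple_lsh_transform(X, M):
    """SIMPLE-LSH transform: scale by the (local) max norm M and append
    sqrt(1 - ||x/M||^2) so every item lies on the unit sphere."""
    Xs = X / M
    extra = np.sqrt(np.maximum(0.0, 1.0 - np.sum(Xs ** 2, axis=1)))
    return np.hstack([Xs, extra[:, None]])


class RangeLSHIndex:
    """Norm-ranging sketch: split items into sub-datasets by 2-norm and
    build an independent sign-random-projection index for each split."""

    def __init__(self, X, num_splits=4, num_bits=16, seed=0):
        rng = np.random.default_rng(seed)
        norms = np.linalg.norm(X, axis=1)
        order = np.argsort(norms)
        self.splits = []
        for ids in np.array_split(order, num_splits):
            M = norms[ids].max()                 # local max norm of this split
            P = simple_lsh_transform(X[ids], M)
            W = rng.standard_normal((P.shape[1], num_bits))
            codes = (P @ W > 0)                  # sign random projections
            buckets = {}
            for i, c in zip(ids, map(tuple, codes)):
                buckets.setdefault(c, []).append(int(i))
            self.splits.append((W, buckets))
        self.X = X

    def query(self, q, top_k=10):
        qa = np.append(q / np.linalg.norm(q), 0.0)   # query transform: append 0
        cand = set()
        for W, buckets in self.splits:
            code = tuple(qa @ W > 0)
            cand.update(buckets.get(code, []))
        cand = list(cand)
        if not cand:
            return []
        scores = self.X[cand] @ q                # rank candidates by true inner product
        return [cand[i] for i in np.argsort(-scores)[:top_k]]
```

Because each split uses its own maximum norm, items with small norms are no longer crushed toward zero by a single global normalizer, which is the excessive-normalization problem the abstract refers to.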

Cite

Text

Yan et al. "Norm-Ranging LSH for Maximum Inner Product Search." Neural Information Processing Systems, 2018.

Markdown

[Yan et al. "Norm-Ranging LSH for Maximum Inner Product Search." Neural Information Processing Systems, 2018.](https://mlanthology.org/neurips/2018/yan2018neurips-normranging/)

BibTeX

@inproceedings{yan2018neurips-normranging,
  title     = {{Norm-Ranging LSH for Maximum Inner Product Search}},
  author    = {Yan, Xiao and Li, Jinfeng and Dai, Xinyan and Chen, Hongzhi and Cheng, James},
  booktitle = {Neural Information Processing Systems},
  year      = {2018},
  pages     = {2952-2961},
  url       = {https://mlanthology.org/neurips/2018/yan2018neurips-normranging/}
}