Beyond Pairwise: Provably Fast Algorithms for Approximate $k$-Way Similarity Search
Abstract
We go beyond the notion of pairwise similarity and look into search problems with $k$-way similarity functions. In this paper, we focus on problems related to \emph{3-way Jaccard} similarity: $\mathcal{R}^{3way}= \frac{|S_1 \cap S_2 \cap S_3|}{|S_1 \cup S_2 \cup S_3|}$, $S_1, S_2, S_3 \in \mathcal{C}$, where $\mathcal{C}$ is a size $n$ collection of sets (or binary vectors). We show that approximate $\mathcal{R}^{3way}$ similarity search problems admit fast algorithms with provable guarantees, analogous to the pairwise case. Our analysis and speedup guarantees naturally extend to $k$-way resemblance. In the process, we extend traditional framework of \emph{locality sensitive hashing (LSH)} to handle higher order similarities, which could be of independent theoretical interest. The applicability of $\mathcal{R}^{3way}$ search is shown on the Google sets" application. In addition, we demonstrate the advantage of $\mathcal{R}^{3way}$ resemblance over the pairwise case in improving retrieval quality."
Cite
Text
Shrivastava and Li. "Beyond Pairwise: Provably Fast Algorithms for Approximate $k$-Way Similarity Search." Neural Information Processing Systems, 2013.Markdown
[Shrivastava and Li. "Beyond Pairwise: Provably Fast Algorithms for Approximate $k$-Way Similarity Search." Neural Information Processing Systems, 2013.](https://mlanthology.org/neurips/2013/shrivastava2013neurips-beyond/)BibTeX
@inproceedings{shrivastava2013neurips-beyond,
title = {{Beyond Pairwise: Provably Fast Algorithms for Approximate $k$-Way Similarity Search}},
author = {Shrivastava, Anshumali and Li, Ping},
booktitle = {Neural Information Processing Systems},
year = {2013},
pages = {791-799},
url = {https://mlanthology.org/neurips/2013/shrivastava2013neurips-beyond/}
}