Near Neighbor: Who Is the Fairest of Them All?

Har-Peled, Sariel; Mahabadi, Sepideh

Near Neighbor: Who Is the Fairest of Them All?

NeurIPS 2019 pp. 13176-13187

/neurips/2019/harpeled2019neurips-near/

Abstract

In this work we study a "fair" variant of the near neighbor problem. Namely, given a set of $n$ points $P$ and a parameter $r$, the goal is to preprocess the points, such that given a query point $q$, any point in the $r$-neighborhood of the query, i.e., $B(q, r)$, have the same probability of being reported as the near neighbor. We show that LSH based algorithms can be made fair, without a significant loss in efficiency. Specifically, we show an algorithm that reports a point $p$ in the $r$-neighborhood of a query $q$ with almost uniform probability. The time to report such a point is proportional to $O(\dns(q.r) Q(n, c))$, and its space is $O(S(n, c))$, where $Q(n, c)$ and $S(n, c)$ are the query time and space of an LSH algorithm for $c$-approximate near neighbor, and $\dns(q, r)$ is a function of the local density around $q$. Our approach works more generally for sampling uniformly from a sub-collection of sets of a given collection and can be used in a few other applications. Finally, we run experiments to show performance of our approach on real data.

PDF NeurIPS Semantic Scholar

Cite

Text

Har-Peled and Mahabadi. "Near Neighbor: Who Is the Fairest of Them All?." Neural Information Processing Systems, 2019.

Markdown

[Har-Peled and Mahabadi. "Near Neighbor: Who Is the Fairest of Them All?." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/harpeled2019neurips-near/)

BibTeX

@inproceedings{harpeled2019neurips-near,
  title     = {{Near Neighbor: Who Is the Fairest of Them All?}},
  author    = {Har-Peled, Sariel and Mahabadi, Sepideh},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {13176-13187},
  url       = {https://mlanthology.org/neurips/2019/harpeled2019neurips-near/}
}