Fast and Bayes-Consistent Nearest Neighbors

Abstract

Research on nearest-neighbor methods tends to focus dichotomously on either the statistical or the computational aspects: on, say, Bayes consistency and rates of convergence, or on techniques for speeding up the proximity search. This paper aims to bridge these realms: to reap the advantages of fast evaluation time while maintaining Bayes consistency, and to do so without sacrificing too much in the risk decay rate. We combine the locality-sensitive hashing (LSH) technique with a novel missing-mass argument to obtain a fast and Bayes-consistent classifier. Our algorithm's prediction runtime compares favorably against state-of-the-art approximate NN methods, while maintaining Bayes consistency and attaining rates comparable to minimax. On samples of size $n$ in $\mathbb{R}^d$, our pre-processing phase has runtime $O(d n \log n)$, while the evaluation phase has runtime $O(d \log n)$ per query point.
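For intuition, the sketch below illustrates the generic LSH ingredient only: hash the training sample into buckets with random projections, then label a query by a majority vote over its bucket. This is not the paper's algorithm; the missing-mass argument, the actual hash construction, and the parameter choices are omitted, and the class name and knobs (`k`, `w`) are hypothetical placeholders.

```python
import numpy as np

class LSHBucketClassifier:
    """Illustrative LSH-bucket classifier (not the paper's method).

    Training points are hashed by h(x) = floor((Ax + b) / w) with a
    random Gaussian matrix A; a query is labeled by a majority vote
    over the labels in its bucket.
    """

    def __init__(self, k=8, w=1.0, seed=0):
        self.k, self.w = k, w              # hypothetical tuning knobs
        self.rng = np.random.default_rng(seed)
        self.buckets = {}

    def _hash(self, X):
        # Coordinate-wise quantized random projections.
        return np.floor((X @ self.A.T + self.b) / self.w).astype(int)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        d = X.shape[1]
        self.A = self.rng.normal(size=(self.k, d))
        self.b = self.rng.uniform(0.0, self.w, size=self.k)
        for code, label in zip(map(tuple, self._hash(X)), y):
            self.buckets.setdefault(code, []).append(label)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        preds = []
        for code in map(tuple, self._hash(X)):
            labels = self.buckets.get(code)
            if labels:  # majority vote within the bucket
                vals, counts = np.unique(labels, return_counts=True)
                preds.append(vals[np.argmax(counts)])
            else:       # empty bucket: fall back to a default label
                preds.append(0)
        return np.array(preds)
```

With the buckets stored in a hash table, each query costs one $O(dk)$ projection plus a constant-time lookup, which is what makes LSH-style evaluation fast compared with exact nearest-neighbor search.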

Cite

Text

Efremenko et al. "Fast and Bayes-Consistent Nearest Neighbors." Artificial Intelligence and Statistics, 2020.

Markdown

[Efremenko et al. "Fast and Bayes-Consistent Nearest Neighbors." Artificial Intelligence and Statistics, 2020.](https://mlanthology.org/aistats/2020/efremenko2020aistats-fast/)

BibTeX

@inproceedings{efremenko2020aistats-fast,
  title     = {{Fast and Bayes-Consistent Nearest Neighbors}},
  author    = {Efremenko, Klim and Kontorovich, Aryeh and Noivirt, Moshe},
  booktitle = {Artificial Intelligence and Statistics},
  year      = {2020},
  pages     = {1276--1286},
  volume    = {108},
  url       = {https://mlanthology.org/aistats/2020/efremenko2020aistats-fast/}
}