Cover Trees for Nearest Neighbor

Abstract

We present a tree data structure for fast nearest neighbor operations in general n-point metric spaces (where the data set consists of n points). The data structure requires O(n) space regardless of the metric's structure yet maintains all performance properties of a navigating net (Krauthgamer & Lee, 2004b). If the point set has a bounded expansion constant c, which is a measure of the intrinsic dimensionality, as defined in (Karger & Ruhl, 2002), the cover tree data structure can be constructed in O(c^6 n log n) time. Furthermore, nearest neighbor queries require time only logarithmic in n, in particular O(c^12 log n) time. Our experimental results show speedups over the brute force search varying between one and several orders of magnitude on natural machine learning datasets.
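
The expansion constant c in the bounds above can be made concrete with a small sketch. It assumes the usual Karger & Ruhl style definition: c is the smallest value such that the closed ball of radius 2r around any data point holds at most c times as many data points as the ball of radius r. The function names `ball_size` and `expansion_constant` are hypothetical and chosen for illustration; this is a brute-force check for small point sets, not the cover tree algorithm and not code from the paper.

```python
from typing import Callable, Sequence, TypeVar

P = TypeVar("P")


def ball_size(points: Sequence[P], dist: Callable[[P, P], float],
              center: P, radius: float) -> int:
    """Count the data points within `radius` of `center` (closed ball)."""
    return sum(1 for q in points if dist(center, q) <= radius)


def expansion_constant(points: Sequence[P],
                       dist: Callable[[P, P], float]) -> float:
    """Largest observed ratio |B(p, 2r)| / |B(p, r)| over data points p.

    Assumption (illustrative, not from the paper's code): the ratio can only
    change where a ball boundary crosses a data point, that is at
    r = d(p, q) or r = d(p, q) / 2, so only those radii are tested.
    Runs in roughly cubic time, so it is meant for small inputs only.
    """
    worst = 1.0
    for p in points:
        radii = set()
        for q in points:
            d = dist(p, q)
            if d > 0:
                radii.add(d)
                radii.add(d / 2)
        for r in radii:
            # ball_size is at least 1 because p itself lies in every ball.
            ratio = ball_size(points, dist, p, 2 * r) / ball_size(points, dist, p, r)
            worst = max(worst, ratio)
    return worst
```

For example, `expansion_constant([0, 1, 2, 3], lambda a, b: abs(a - b))` measures how fast balls grow for four evenly spaced points on a line. Small values of c indicate low intrinsic dimensionality, which is the regime where the stated construction and query bounds are most favorable.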

Cite

Text

Beygelzimer et al. "Cover Trees for Nearest Neighbor." International Conference on Machine Learning, 2006. doi:10.1145/1143844.1143857

Markdown

[Beygelzimer et al. "Cover Trees for Nearest Neighbor." International Conference on Machine Learning, 2006.](https://mlanthology.org/icml/2006/beygelzimer2006icml-cover/) doi:10.1145/1143844.1143857

BibTeX

@inproceedings{beygelzimer2006icml-cover,
  title     = {{Cover Trees for Nearest Neighbor}},
  author    = {Beygelzimer, Alina and Kakade, Sham M. and Langford, John},
  booktitle = {International Conference on Machine Learning},
  year      = {2006},
  pages     = {97-104},
  doi       = {10.1145/1143844.1143857},
  url       = {https://mlanthology.org/icml/2006/beygelzimer2006icml-cover/}
}