Fast Spectral Analysis for Approximate Nearest Neighbor Search

Wang, Jing; Shen, Jie

doi:10.1007/S10994-021-06124-1

Fast Spectral Analysis for Approximate Nearest Neighbor Search

Jing Wang, Jie Shen

MLJ 2022 pp. 2297-2322

doi:10.1007/S10994-021-06124-1 /mlj/2022/wang2022mlj-fast/

Abstract

In large-scale machine learning, of central interest is the problem of approximate nearest neighbor (ANN) search, where the goal is to query particular points that are close to a given object under certain metric. In this paper, we develop a novel data-driven ANN search algorithm where the data structure is learned by fast spectral technique based on s landmarks selected by approximate ridge leverage scores. We show that with overwhelming probability, our algorithm returns the $(1+\epsilon /4)$ ( 1 + ϵ / 4 ) -ANN for any approximation parameter $\epsilon \in (0,1)$ ϵ ∈ ( 0 , 1 ) . A remarkable feature of our algorithm is that it is computationally efficient. Specifically, learning k -length hash codes requires $O((s^3+ns^2)\log n)$ O ( ( s 3 + n s 2 ) log n ) running time and $O(d^2)$ O ( d 2 ) extra space, and returning the $(1+\epsilon /4)$ ( 1 + ϵ / 4 ) -ANN of the query needs $O(k\log n)$ O ( k log n ) running time. The experimental results on computer vision and natural language understanding tasks demonstrate the significant advantage of our algorithm compared to state-of-the-art methods.

PDF MLJ Semantic Scholar

Cite

Text

Wang and Shen. "Fast Spectral Analysis for Approximate Nearest Neighbor Search." Machine Learning, 2022. doi:10.1007/S10994-021-06124-1

Markdown

[Wang and Shen. "Fast Spectral Analysis for Approximate Nearest Neighbor Search." Machine Learning, 2022.](https://mlanthology.org/mlj/2022/wang2022mlj-fast/) doi:10.1007/S10994-021-06124-1

BibTeX

@article{wang2022mlj-fast,
  title     = {{Fast Spectral Analysis for Approximate Nearest Neighbor Search}},
  author    = {Wang, Jing and Shen, Jie},
  journal   = {Machine Learning},
  year      = {2022},
  pages     = {2297-2322},
  doi       = {10.1007/S10994-021-06124-1},
  volume    = {111},
  url       = {https://mlanthology.org/mlj/2022/wang2022mlj-fast/}
}