Fast Spectral Analysis for Approximate Nearest Neighbor Search
Abstract
In large-scale machine learning, of central interest is the problem of approximate nearest neighbor (ANN) search, where the goal is to query particular points that are close to a given object under certain metric. In this paper, we develop a novel data-driven ANN search algorithm where the data structure is learned by fast spectral technique based on s landmarks selected by approximate ridge leverage scores. We show that with overwhelming probability, our algorithm returns the $(1+\epsilon /4)$ ( 1 + ϵ / 4 ) -ANN for any approximation parameter $\epsilon \in (0,1)$ ϵ ∈ ( 0 , 1 ) . A remarkable feature of our algorithm is that it is computationally efficient. Specifically, learning k -length hash codes requires $O((s^3+ns^2)\log n)$ O ( ( s 3 + n s 2 ) log n ) running time and $O(d^2)$ O ( d 2 ) extra space, and returning the $(1+\epsilon /4)$ ( 1 + ϵ / 4 ) -ANN of the query needs $O(k\log n)$ O ( k log n ) running time. The experimental results on computer vision and natural language understanding tasks demonstrate the significant advantage of our algorithm compared to state-of-the-art methods.
Cite
Text
Wang and Shen. "Fast Spectral Analysis for Approximate Nearest Neighbor Search." Machine Learning, 2022. doi:10.1007/S10994-021-06124-1Markdown
[Wang and Shen. "Fast Spectral Analysis for Approximate Nearest Neighbor Search." Machine Learning, 2022.](https://mlanthology.org/mlj/2022/wang2022mlj-fast/) doi:10.1007/S10994-021-06124-1BibTeX
@article{wang2022mlj-fast,
title = {{Fast Spectral Analysis for Approximate Nearest Neighbor Search}},
author = {Wang, Jing and Shen, Jie},
journal = {Machine Learning},
year = {2022},
pages = {2297-2322},
doi = {10.1007/S10994-021-06124-1},
volume = {111},
url = {https://mlanthology.org/mlj/2022/wang2022mlj-fast/}
}