Approximate Nearest Neighbor Search on HDD

Abstract

Nearest Neighbor (NN) search plays important roles in Computer Vision algorithms. Especially, NN search on immensely large amount of image data set stored on the Internet is getting highlighted. For dealing with such huge data, main memory of a single PC is insufficient. As a solution, we propose an approximate NN search on hard disk drive (HDD) in this paper. This algorithm is based on recently proposed Principal Component Hashing (PCH). In our algorithm ¿PCH on HDD¿ (PCHD), the hash bins are represented by the leaf nodes of B+ tree for dealing with the dynamic addition and deletion of the data. Of course, the search time is slower than the original PCH. However, we found some advantages of this approach through the experiments using standard PC and 10000 stored images: 1) the memory consumption is 42 times smaller, 2) the first search time including the cold start-up time is 4.3 times faster (PCH:31.8[s], PCHD: 7.4[s]), 3) and interestingly, the successive searches are accelerated owing to the cache mechanism embedded in the operating system (mean search time decreases from 7.4[s] to 0.64[s]). We also confirmed that our algorithm performs NN search on 1 million image datasets with only 193MB memory consumption; however, PCH cannot, because of the huge memory consumption. These properties reveal that this algorithm is suitable for non-time-critical NN search applications and NN search engine called by web servers, where the search engine starts up in response to occasional queries.

Cite

Text

Himei and Wada. "Approximate Nearest Neighbor Search on HDD." IEEE/CVF International Conference on Computer Vision Workshops, 2009. doi:10.1109/ICCVW.2009.5457540

Markdown

[Himei and Wada. "Approximate Nearest Neighbor Search on HDD." IEEE/CVF International Conference on Computer Vision Workshops, 2009.](https://mlanthology.org/iccvw/2009/himei2009iccvw-approximate/) doi:10.1109/ICCVW.2009.5457540

BibTeX

@inproceedings{himei2009iccvw-approximate,
  title     = {{Approximate Nearest Neighbor Search on HDD}},
  author    = {Himei, Noritaka and Wada, Toshikazu},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2009},
  pages     = {2101-2108},
  doi       = {10.1109/ICCVW.2009.5457540},
  url       = {https://mlanthology.org/iccvw/2009/himei2009iccvw-approximate/}
}