Efficient Indexing of Billion-Scale Datasets of Deep Descriptors

Abstract

Existing billion-scale nearest neighbor search systems have mostly been compared on a single dataset of one billion SIFT vectors, where systems based on the Inverted Multi-Index (IMI) have performed very well, achieving state-of-the-art recall in several milliseconds. SIFT-like descriptors, however, are quickly being replaced by descriptors based on deep neural networks (DNN), which provide better performance for many computer vision tasks. In this paper, we introduce a new dataset of one billion DNN-based descriptors and reveal the relative inefficiency of IMI-based indexing for such descriptors compared to SIFT data. We then introduce two new indexing structures, the Non-Orthogonal Inverted Multi-Index (NO-IMI) and the Generalized Non-Orthogonal Inverted Multi-Index (GNO-IMI). We show that, owing to their additional flexibility, the new structures adapt better to the distribution of DNN descriptors. In particular, extensive experiments on the new dataset demonstrate that these data structures provide a considerably better trade-off between retrieval speed and recall than the standard Inverted Multi-Index, given a similar amount of memory.
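
For readers unfamiliar with the baseline, the following is a minimal sketch of how a standard two-codebook Inverted Multi-Index assigns vectors to cells and collects candidates for a query. It is an illustration under stated assumptions, not the authors' implementation: the function names (`imi_cell`, `build_index`, `candidates`) are hypothetical, the codebooks are assumed to be already trained (for example by k-means on each half of the vectors), and cells are ranked by a full sort rather than the priority-queue traversal a production system would use. The NO-IMI and GNO-IMI structures proposed in the paper are not shown here.

```python
from itertools import product


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, v):
    """Index of the codebook centroid closest to v."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))


def imi_cell(vec, codebook_a, codebook_b):
    """Split vec into two halves and quantize each half independently.

    The pair of centroid indices identifies one cell of the multi-index.
    """
    half = len(vec) // 2
    return nearest(codebook_a, vec[:half]), nearest(codebook_b, vec[half:])


def build_index(vectors, codebook_a, codebook_b):
    """Map each cell (i, j) to the list of vector ids that fall into it."""
    index = {}
    for vid, vec in enumerate(vectors):
        cell = imi_cell(vec, codebook_a, codebook_b)
        index.setdefault(cell, []).append(vid)
    return index


def candidates(query, index, codebook_a, codebook_b, max_candidates):
    """Visit cells in order of increasing distance to the query.

    Assumption: all K*K cells are sorted for simplicity; this is a
    simplification that is only practical for small codebooks.
    """
    half = len(query) // 2
    da = [sq_dist(c, query[:half]) for c in codebook_a]
    db = [sq_dist(c, query[half:]) for c in codebook_b]
    # The distance to cell (i, j) decomposes into the sum of the two
    # half-space distances, because the halves are orthogonal subspaces.
    cells = sorted(product(range(len(da)), range(len(db))),
                   key=lambda ij: da[ij[0]] + db[ij[1]])
    out = []
    for cell in cells:
        out.extend(index.get(cell, []))
        if len(out) >= max_candidates:
            break
    return out[:max_candidates]
```

The candidate list returned by such an index is normally reranked with compressed or exact distances. The sketch covers only the cell assignment and traversal, which is the part the abstract contrasts with the new structures.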

Cite

Text

Babenko and Lempitsky. "Efficient Indexing of Billion-Scale Datasets of Deep Descriptors." Conference on Computer Vision and Pattern Recognition, 2016. doi:10.1109/CVPR.2016.226

Markdown

[Babenko and Lempitsky. "Efficient Indexing of Billion-Scale Datasets of Deep Descriptors." Conference on Computer Vision and Pattern Recognition, 2016.](https://mlanthology.org/cvpr/2016/babenko2016cvpr-efficient/) doi:10.1109/CVPR.2016.226

BibTeX

@inproceedings{babenko2016cvpr-efficient,
  title     = {{Efficient Indexing of Billion-Scale Datasets of Deep Descriptors}},
  author    = {Babenko, Artem and Lempitsky, Victor},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2016},
  doi       = {10.1109/CVPR.2016.226},
  url       = {https://mlanthology.org/cvpr/2016/babenko2016cvpr-efficient/}
}