Randomized Algorithms for Comparison-Based Search
Abstract
This paper addresses the problem of finding the nearest neighbor (or one of the $R$-nearest neighbors) of a query object $q$ in a database of $n$ objects, when we can only use a comparison oracle. The comparison oracle, given two reference objects and a query object, returns the reference object most similar to the query object. The main problem we study is how to search the database for the nearest neighbor (NN) of a query, while minimizing the questions. The difficulty of this problem depends on properties of the underlying database. We show the importance of a characterization: \emph{combinatorial disorder} $D$ which defines approximate triangle inequalities on ranks. We present a lower bound of $\Omega(D\log \frac{n}{D}+D^2)$ average number of questions in the search phase for any randomized algorithm, which demonstrates the fundamental role of $D$ for worst case behavior. We develop a randomized scheme for NN retrieval in $O(D^3\log^2 n+ D\log^2 n \log\log n^{D^3})$ questions. The learning requires asking $O(n D^3\log^2 n+ D \log^2 n \log\log n^{D^3})$ questions and $O(n\log^2n/\log(2D))$ bits to store.
Cite
Text
Tschopp et al. "Randomized Algorithms for Comparison-Based Search." Neural Information Processing Systems, 2011.Markdown
[Tschopp et al. "Randomized Algorithms for Comparison-Based Search." Neural Information Processing Systems, 2011.](https://mlanthology.org/neurips/2011/tschopp2011neurips-randomized/)BibTeX
@inproceedings{tschopp2011neurips-randomized,
title = {{Randomized Algorithms for Comparison-Based Search}},
author = {Tschopp, Dominique and Diggavi, Suhas and Delgosha, Payam and Mohajer, Soheil},
booktitle = {Neural Information Processing Systems},
year = {2011},
pages = {2231-2239},
url = {https://mlanthology.org/neurips/2011/tschopp2011neurips-randomized/}
}