Optimistic Query Routing in Clustering-Based Approximate Maximum Inner Product Search

Abstract

Clustering-based nearest neighbor search algorithms partition points into shards to form an index, and search only a subset of shards to process a query. Even though search efficacy is heavily influenced by the algorithm that identifies the shards to probe, it has received little attention in the literature. We study routing in clustering-based maximum inner product search, which includes cosine similarity search. We unpack existing routers and notice the surprising role of optimism. We then take a page from the sequential decision making literature and formalize that insight following the principle of ``optimism in the face of uncertainty.'' In particular, we present a framework that incorporates the moments of the distribution of inner products within each shard to estimate the maximum inner product. We then develop a practical instance of our algorithm that uses only the first two moments to reach the same accuracy as state-of-the-art routers by probing up to $50\%$ fewer points on benchmark datasets without compromising efficiency. Our algorithm is also space-efficient: we design a sketch of the second moment whose size is independent of the number of points and requires $\mathcal{O}(1)$ vectors per shard.

Cite

Text

Bruch et al. "Optimistic Query Routing in Clustering-Based Approximate Maximum Inner Product Search." Advances in Neural Information Processing Systems, 2025.

Markdown

[Bruch et al. "Optimistic Query Routing in Clustering-Based Approximate Maximum Inner Product Search." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/bruch2025neurips-optimistic/)

BibTeX

@inproceedings{bruch2025neurips-optimistic,
  title     = {{Optimistic Query Routing in Clustering-Based Approximate Maximum Inner Product Search}},
  author    = {Bruch, Sebastian and Krishnan, Aditya and Nardini, Franco Maria},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/bruch2025neurips-optimistic/}
}