Sort Before You Prune: Improved Worst-Case Guarantees of the DiskANN Family of Graphs

Abstract

Graph-based data structures have become powerful and ubiquitous tools for scalable approximate nearest-neighbor (ANN) search over the past decade. In spite of their apparent practical performance, there has only recently been progress on the worst-case performance of these data structures. Indeed, the influential work of Indyx and Xu (2023) introduced the key concept of $\alpha$-reachable graphs, showing that graphs constructed by the DiskANN algorithm (Subramanya, et. al. 2023) produce an $\left(\frac{\alpha+1}{\alpha-1}\right)$-approximate solution with a simple best-first search that runs in poly-logarithmic query time. In our work, we improve and generalize this analysis as follows: - We introduce sorted $\alpha$-reachable graphs, and use this notion to obtain a stronger approximation factor of $\frac{\alpha}{\alpha-1}$ for the DiskANN algorithm on Euclidean metrics. - We present the first worst-case theoretical analysis for the popular beam-search algorithm, which is used in practice to search these graphs for $k > 1$ candidate nearest neighbors. We also present empirical results validating the significance of sorted $\alpha$-reachable graphs, which aligns with our theoretical findings.

Cite

Text

Gollapudi et al. "Sort Before You Prune: Improved Worst-Case Guarantees of the DiskANN Family of Graphs." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Gollapudi et al. "Sort Before You Prune: Improved Worst-Case Guarantees of the DiskANN Family of Graphs." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/gollapudi2025icml-sort/)

BibTeX

@inproceedings{gollapudi2025icml-sort,
  title     = {{Sort Before You Prune: Improved Worst-Case Guarantees of the DiskANN Family of Graphs}},
  author    = {Gollapudi, Siddharth and Krishnaswamy, Ravishankar and Shiragur, Kirankumar and Wardhan, Harsh},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {19787-19808},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/gollapudi2025icml-sort/}
}