Learning to Design Data-Structures: A Case Study of Nearest Neighbor Search

Abstract

We propose a general framework for automating data-structure design and apply it to the problem of nearest neighbor search. Our model adapts to the underlying data distribution and provides fine-grained control over query and space complexity, enabling the discovery of solutions tailored to problem-specific constraints. We are able to reverse-engineer learned algorithms in several settings. In 1D, the model discovers optimal distribution (in)dependent algorithms such as binary search and variants of interpolation search. In higher dimensions, the model learns solutions that resemble K-d trees in some regimes, while in others, have elements of locality-sensitive hashing.

Cite

Text

Salemohamed et al. "Learning to Design Data-Structures: A Case Study of Nearest Neighbor Search." ICML 2024 Workshops: Differentiable_Almost_Everything, 2024.

Markdown

[Salemohamed et al. "Learning to Design Data-Structures: A Case Study of Nearest Neighbor Search." ICML 2024 Workshops: Differentiable_Almost_Everything, 2024.](https://mlanthology.org/icmlw/2024/salemohamed2024icmlw-learning/)

BibTeX

@inproceedings{salemohamed2024icmlw-learning,
  title     = {{Learning to Design Data-Structures: A Case Study of Nearest Neighbor Search}},
  author    = {Salemohamed, Omar and Sharan, Vatsal and Garg, Shivam and Charlin, Laurent and Valiant, Gregory},
  booktitle = {ICML 2024 Workshops: Differentiable_Almost_Everything},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/salemohamed2024icmlw-learning/}
}