LotusFilter: Fast Diverse Nearest Neighbor Search via a Learned Cutoff Table

Abstract

Approximate nearest neighbor search (ANNS) is an essential building block for applications like RAG but can sometimes yield results that are overly similar to each other. In certain scenarios, search results should be similar to the query and yet diverse. We propose LotusFilter, a post-processing module to diversify ANNS results. We precompute a cutoff table summarizing vectors that are close to each other. During filtering, LotusFilter greedily looks up the table to delete redundant vectors from the candidates. We demonstrate that LotusFilter runs fast (0.02 ms/query) in settings resembling real-world RAG applications, using features such as OpenAI embeddings. Our code is publicly available at https://github.com/matsui528/lotf.
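The two steps the abstract describes, a precomputed table of mutually close vectors and a greedy pass over the candidates, can be illustrated with a minimal sketch. This is not the API of the lotf repository: the function names `build_cutoff_table` and `greedy_filter`, the squared-distance threshold `eps`, and the brute-force table construction are all assumptions made for illustration, and the abstract does not say how the threshold is chosen or learned.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_cutoff_table(vectors, eps):
    """Hypothetical offline step: for each vector id, store the ids of all
    other vectors whose squared distance to it is below eps.
    Brute force, O(N^2); only suitable for a toy example."""
    table = []
    for i, vi in enumerate(vectors):
        close = {j for j, vj in enumerate(vectors)
                 if j != i and squared_l2(vi, vj) < eps}
        table.append(close)
    return table


def greedy_filter(candidates, table, k):
    """Hypothetical online step: walk the candidate ids (assumed sorted by
    increasing distance to the query), keep each id that has not been
    removed, and remove everything the table lists as close to it."""
    removed = set()
    kept = []
    for c in candidates:
        if c in removed:
            continue
        kept.append(c)
        if len(kept) == k:
            break
        removed |= table[c]
    return kept


# Toy 1-D database: three points near 0, two near 1, one at 3.
vectors = [(0.0,), (0.1,), (0.2,), (1.0,), (1.05,), (3.0,)]
table = build_cutoff_table(vectors, eps=0.05)

# Candidates as an ANNS index might return them for a query at 0.
result = greedy_filter([0, 1, 2, 3, 4, 5], table, k=3)
print(result)  # [0, 3, 5]
```

Without filtering, the top three results would be ids 0, 1, and 2, which are near-duplicates of each other. With the table lookup, picking id 0 removes ids 1 and 2, and picking id 3 removes id 4, so the final list covers three distinct regions. The online step does no distance computation in this sketch, only set lookups.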

Cite

Text

Matsui. "LotusFilter: Fast Diverse Nearest Neighbor Search via a Learned Cutoff Table." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.02833

Markdown

[Matsui. "LotusFilter: Fast Diverse Nearest Neighbor Search via a Learned Cutoff Table." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/matsui2025cvpr-lotusfilter/) doi:10.1109/CVPR52734.2025.02833

BibTeX

@inproceedings{matsui2025cvpr-lotusfilter,
  title     = {{LotusFilter: Fast Diverse Nearest Neighbor Search via a Learned Cutoff Table}},
  author    = {Matsui, Yusuke},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {30430--30439},
  doi       = {10.1109/CVPR52734.2025.02833},
  url       = {https://mlanthology.org/cvpr/2025/matsui2025cvpr-lotusfilter/}
}