A Dense Subset Index for Collective Query Coverage
Abstract
In traditional information retrieval, corpus items compete with each other to occupy top ranks in response to a query. In contrast, in many recent retrieval scenarios associated with complex, multi-hop question answering or text-to-SQL, items are not self-complete: they must instead collaborate, i.e., information from multiple items must be combined to respond to the query. In the context of modern dense retrieval, this need translates into finding a small collection of corpus items whose contextual word vectors collectively cover the contextual word vectors of the query. The central challenge is to retrieve a near-optimal collection of covering items in time that is sublinear in corpus size. By establishing coverage as a submodular objective, we enable successive dense index probes to quickly assemble an item collection that achieves near-optimal coverage. Successive query vectors are iteratively 'edited', and the dense index is built using random projections of a novel, lifted dense vector space. Beyond rigorous theoretical guarantees, we report on a scalable implementation of this new form of vector database. Extensive experiments establish the empirical success of DISCo in terms of the best tradeoffs between coverage and query latency.
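To make the idea of collective coverage concrete, the following is a minimal sketch and not the paper's method. It assumes a facility-location-style coverage score (each query vector is credited with its best non-negative dot product against any vector in the selected items), which is one standard monotone submodular objective; the abstract does not state the exact coverage function. It also scans the whole corpus at every step, so it is linear in corpus size and does not reproduce the index probes, query editing, or lifted random projections described above. The names `greedy_cover`, `query_vecs`, and `corpus` are hypothetical.

```python
import numpy as np

def greedy_cover(query_vecs, corpus, k):
    """Greedily pick up to k items that maximize an ASSUMED coverage score.

    query_vecs: array of shape (num_query_vecs, dim)
    corpus:     list of arrays, each of shape (num_item_vecs, dim)
    Returns the chosen item indices (in pick order) and the final coverage.
    """
    chosen = []
    # Per-query-vector coverage achieved by the items chosen so far.
    covered = np.zeros(len(query_vecs))
    for _ in range(k):
        best_gain, best_idx, best_cov = 0.0, None, None
        for idx, item_vecs in enumerate(corpus):  # brute-force scan, not an index
            if idx in chosen:
                continue
            # Best similarity of each query vector to this item's vectors.
            item_best = (query_vecs @ item_vecs.T).max(axis=1)
            new_cov = np.maximum(covered, item_best)
            gain = new_cov.sum() - covered.sum()  # marginal coverage gain
            if gain > best_gain + 1e-12:
                best_gain, best_idx, best_cov = gain, idx, new_cov
        if best_idx is None:  # no remaining item improves coverage
            break
        chosen.append(best_idx)
        covered = best_cov
    return chosen, float(covered.sum())

# Toy example: three orthogonal query vectors, three candidate items.
query_vecs = np.eye(3)
corpus = [
    np.array([[1.0, 0.0, 0.0]]),                   # covers query vector 0
    np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]),  # covers query vectors 1 and 2
    np.array([[0.6, 0.6, 0.0]]),                   # partially covers 0 and 1
]
chosen, total = greedy_cover(query_vecs, corpus, k=2)
print(chosen, total)
```

In the toy example no single item covers all three query vectors, so the items have to be combined. For monotone submodular objectives of this kind, greedy selection is known to achieve at least a (1 - 1/e) fraction of the optimal coverage, which is why the submodularity claim in the abstract matters.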
Cite
Text
Nair et al. "A Dense Subset Index for Collective Query Coverage." International Conference on Learning Representations, 2026.
Markdown
[Nair et al. "A Dense Subset Index for Collective Query Coverage." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/nair2026iclr-dense/)
BibTeX
@inproceedings{nair2026iclr-dense,
title = {{A Dense Subset Index for Collective Query Coverage}},
author = {Nair, Kartik and Chakraborty, Pritish and Tambat, Atharva Abhijit and Roy, Indradyumna and Chakrabarti, Soumen and Dasgupta, Anirban and De, Abir},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/nair2026iclr-dense/}
}