GIST: Greedy Independent Set Thresholding for Max-Min Diversification with Submodular Utility
Abstract
This work studies a novel subset selection problem called *max-min diversification with monotone submodular utility* (MDMS), which has a wide range of applications in machine learning, e.g., data sampling and feature selection. Given a set of points in a metric space, the goal of MDMS is to maximize $f(S) = g(S) + \lambda \cdot \text{div}(S)$ subject to a cardinality constraint $|S| \le k$, where $g(S)$ is a monotone submodular function and $\text{div}(S) = \min_{u,v \in S : u \ne v} \text{dist}(u,v)$ is the *max-min diversity* objective. We propose the `GIST` algorithm, which gives a $\frac{1}{2}$-approximation guarantee for MDMS by approximating a series of maximum independent set problems with a bicriteria greedy algorithm. We also prove that it is NP-hard to approximate within a factor of $0.5584$. Finally, we show in our empirical study that `GIST` outperforms state-of-the-art benchmarks for a single-shot data sampling task on ImageNet.
Cite
Text
Fahrbach et al. "GIST: Greedy Independent Set Thresholding for Max-Min Diversification with Submodular Utility." Advances in Neural Information Processing Systems, 2025.Markdown
[Fahrbach et al. "GIST: Greedy Independent Set Thresholding for Max-Min Diversification with Submodular Utility." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/fahrbach2025neurips-gist/)BibTeX
@inproceedings{fahrbach2025neurips-gist,
title = {{GIST: Greedy Independent Set Thresholding for Max-Min Diversification with Submodular Utility}},
author = {Fahrbach, Matthew and Ramalingam, Srikumar and Zadimoghaddam, Morteza and Ahmadian, Sara and Citovsky, Gui and DeSalvo, Giulia},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/fahrbach2025neurips-gist/}
}