Fair K-Center Clustering for Data Summarization

Abstract

In data summarization we want to choose $k$ prototypes in order to summarize a data set. We study a setting where the data set comprises several demographic groups and we are restricted to choose $k_i$ prototypes belonging to group $i$. A common approach to the problem without the fairness constraint is to optimize a centroid-based clustering objective such as $k$-center. A natural extension then is to incorporate the fairness constraint into the clustering problem. Existing algorithms for doing so run in time super-quadratic in the size of the data set, which is in contrast to the standard $k$-center problem being approximable in linear time. In this paper, we resolve this gap by providing a simple approximation algorithm for the $k$-center problem under the fairness constraint with running time linear in the size of the data set and $k$. If the number of demographic groups is small, the approximation guarantee of our algorithm only incurs a constant-factor overhead.

Cite

Text

Kleindessner et al. "Fair K-Center Clustering for Data Summarization." International Conference on Machine Learning, 2019.

Markdown

[Kleindessner et al. "Fair K-Center Clustering for Data Summarization." International Conference on Machine Learning, 2019.](https://mlanthology.org/icml/2019/kleindessner2019icml-fair/)

BibTeX

@inproceedings{kleindessner2019icml-fair,
  title     = {{Fair K-Center Clustering for Data Summarization}},
  author    = {Kleindessner, Matthäus and Awasthi, Pranjal and Morgenstern, Jamie},
  booktitle = {International Conference on Machine Learning},
  year      = {2019},
  pages     = {3448-3457},
  volume    = {97},
  url       = {https://mlanthology.org/icml/2019/kleindessner2019icml-fair/}
}