Dependent Randomized Rounding for Clustering and Partition Systems with Knapsack Constraints

Abstract

Clustering problems are fundamental to unsupervised learning. There is an increased emphasis on fairness in machine learning and AI; one representative notion of fairness is that no single group should be over-represented among the cluster-centers. This notion, and much more general clustering problems, can be formulated with “knapsack” and “partition” constraints. We develop new randomized algorithms targeting such problems, and study two in particular: multi-knapsack median and multi-knapsack center. Our rounding algorithms give new approximation and pseudo-approximation algorithms for these problems. One key technical tool, which may be of independent interest, is a new tail bound analogous to Feige (2006) for sums of random variables with unbounded variances. Such bounds can be useful in inferring properties of large networks using few samples.
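To give a flavor of the dependent-rounding primitives the abstract refers to, the sketch below implements the classic pipage-style pairwise dependent rounding (in the spirit of Srinivasan's dependent rounding), not the paper's specific algorithm: it rounds a fractional vector to {0,1} so that each coordinate keeps its marginal probability while the total sum is preserved at every step, which is exactly the property that lets hard cardinality/partition constraints survive randomized rounding.

```python
import random

def dependent_round(x, rng=None, eps=1e-9):
    """Pipage-style dependent rounding of a fractional vector x in [0,1]^n.

    Invariants (generic dependent rounding, not the paper's algorithm):
      - E[X_i] = x_i for every coordinate (marginals preserved),
      - sum(x) is unchanged by every pairwise step, so a cardinality
        constraint sum(x) = k is satisfied exactly when k is integral.
    """
    rng = rng or random.Random(0)
    x = list(x)

    def frac_indices():
        return [i for i, v in enumerate(x) if eps < v < 1 - eps]

    # Pair up two fractional coordinates and shift mass between them so
    # that at least one becomes integral; probabilities are chosen so the
    # expected change of each coordinate is zero.
    while True:
        idx = frac_indices()
        if len(idx) < 2:
            break
        i, j = idx[0], idx[1]
        a = min(1 - x[i], x[j])  # feasible mass to move j -> i
        b = min(x[i], 1 - x[j])  # feasible mass to move i -> j
        if rng.random() < b / (a + b):
            x[i] += a
            x[j] -= a
        else:
            x[i] -= b
            x[j] += b

    # If the total is non-integral, one coordinate stays fractional;
    # resolve it with an independent Bernoulli flip to keep its marginal.
    idx = frac_indices()
    if idx:
        i = idx[0]
        x[i] = 1.0 if rng.random() < x[i] else 0.0
    return [int(round(v)) for v in x]
```

For example, rounding `[0.5, 0.5, 0.3, 0.7]` (fractional sum 2) always opens exactly 2 "centers", and each coordinate is chosen with probability equal to its fractional value.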

Cite

Text

Harris et al. "Dependent Randomized Rounding for Clustering and Partition Systems with Knapsack Constraints." Journal of Machine Learning Research, 2022.

Markdown

[Harris et al. "Dependent Randomized Rounding for Clustering and Partition Systems with Knapsack Constraints." Journal of Machine Learning Research, 2022.](https://mlanthology.org/jmlr/2022/harris2022jmlr-dependent/)

BibTeX

@article{harris2022jmlr-dependent,
  title     = {{Dependent Randomized Rounding for Clustering and Partition Systems with Knapsack Constraints}},
  author    = {Harris, David G. and Pensyl, Thomas and Srinivasan, Aravind and Trinh, Khoa},
  journal   = {Journal of Machine Learning Research},
  year      = {2022},
  pages     = {1--41},
  volume    = {23},
  url       = {https://mlanthology.org/jmlr/2022/harris2022jmlr-dependent/}
}