Correlation Clustering with Asymmetric Classification Errors

Abstract

In the Correlation Clustering problem, we are given a weighted graph $G$ with its edges labelled as "similar" or "dissimilar" by a binary classifier. The goal is to produce a clustering that minimizes the weight of "disagreements": the sum of the weights of "similar" edges across clusters and "dissimilar" edges within clusters. We study the correlation clustering problem under the following assumption: Every "similar" edge $e$ has weight $w_e \in [ \alpha w, w ]$ and every "dissimilar" edge $e$ has weight $w_e \geq \alpha w$ (where $\alpha \leq 1$ and $w > 0$ is a scaling parameter). We give a $(3 + 2 \log_e (1/\alpha))$ approximation algorithm for this problem. This assumption captures well the scenario when classification errors are asymmetric. Additionally, we show an asymptotically matching Linear Programming integrality gap of $\Omega(\log 1/\alpha)$.

Cite

Text

Jafarov et al. "Correlation Clustering with Asymmetric Classification Errors." International Conference on Machine Learning, 2020.

Markdown

[Jafarov et al. "Correlation Clustering with Asymmetric Classification Errors." International Conference on Machine Learning, 2020.](https://mlanthology.org/icml/2020/jafarov2020icml-correlation/)

BibTeX

@inproceedings{jafarov2020icml-correlation,
  title     = {{Correlation Clustering with Asymmetric Classification Errors}},
  author    = {Jafarov, Jafar and Kalhan, Sanchit and Makarychev, Konstantin and Makarychev, Yury},
  booktitle = {International Conference on Machine Learning},
  year      = {2020},
  pages     = {4641-4650},
  volume    = {119},
  url       = {https://mlanthology.org/icml/2020/jafarov2020icml-correlation/}
}