Bayesian Active Clustering with Pairwise Constraints

Abstract

Clustering can be improved with pairwise constraints that specify similarities between pairs of instances. However, randomly selected constraints can waste labeling effort or even degrade clustering performance. Consequently, how to actively select effective pairwise constraints to improve clustering is an important problem, and it is the focus of this paper. In this work, we introduce a Bayesian clustering model that learns from pairwise constraints. With this model, we present an active learning framework that iteratively selects the most informative pair of instances to query an oracle and updates the model posterior based on the obtained pairwise constraints. We introduce two information-theoretic criteria for selecting informative pairs: one selects the pair with the most uncertainty, and the other chooses the pair that maximizes the marginal information gain about the clustering. Experiments on benchmark datasets demonstrate the effectiveness of the proposed method over state-of-the-art methods.
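As a rough illustration of the uncertainty-based criterion mentioned in the abstract, the sketch below picks the pair whose must-link status is least certain. It is not the paper's model or algorithm: it assumes, purely for illustration, that a set of posterior samples of cluster labels is already available, and it estimates each pair's must-link probability as the fraction of samples in which the two instances share a cluster. The names `pair_entropy`, `most_uncertain_pair`, `label_samples`, and `queried` are hypothetical.

```python
import itertools
import math


def pair_entropy(p):
    """Binary entropy (in bits) of a must-link probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def most_uncertain_pair(label_samples, queried=frozenset()):
    """Return the unqueried pair (i, j) with the highest must-link entropy.

    label_samples: list of label vectors, one per assumed posterior sample.
    queried: pairs (i, j) with i < j that were already sent to the oracle.
    Ties are broken in favor of the first pair in lexicographic order.
    """
    n = len(label_samples[0])
    best_pair, best_h = None, -1.0
    for i, j in itertools.combinations(range(n), 2):
        if (i, j) in queried:
            continue
        # Monte Carlo estimate of P(instances i and j share a cluster).
        p = sum(s[i] == s[j] for s in label_samples) / len(label_samples)
        h = pair_entropy(p)
        if h > best_h:
            best_pair, best_h = (i, j), h
    return best_pair, best_h


# Toy posterior samples over four instances (made-up values).
samples = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
pair, h = most_uncertain_pair(samples)
print(pair, round(h, 4))
```

In an active loop, the selected pair would be sent to the oracle, added to `queried`, and the posterior samples would be refreshed before the next selection. The information-gain criterion would need a different scoring function and is not sketched here.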

Cite

Text

Pei et al. "Bayesian Active Clustering with Pairwise Constraints." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2015. doi:10.1007/978-3-319-23528-8_15

Markdown

[Pei et al. "Bayesian Active Clustering with Pairwise Constraints." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2015.](https://mlanthology.org/ecmlpkdd/2015/pei2015ecmlpkdd-bayesian/) doi:10.1007/978-3-319-23528-8_15

BibTeX

@inproceedings{pei2015ecmlpkdd-bayesian,
  title     = {{Bayesian Active Clustering with Pairwise Constraints}},
  author    = {Pei, Yuanli and Liu, Li-Ping and Fern, Xiaoli Z.},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2015},
  pages     = {235--250},
  doi       = {10.1007/978-3-319-23528-8_15},
  url       = {https://mlanthology.org/ecmlpkdd/2015/pei2015ecmlpkdd-bayesian/}
}