Bayesian Active Clustering with Pairwise Constraints
Abstract
Clustering can be improved with pairwise constraints that specify similarities between pairs of instances. However, randomly selecting constraints can waste labeling effort or even degrade clustering performance. Consequently, how to actively select effective pairwise constraints to improve clustering becomes an important problem, which is the focus of this paper. In this work, we introduce a Bayesian clustering model that learns from pairwise constraints. With this model, we present an active learning framework that iteratively selects the most informative pair of instances to query an oracle, and updates the model posterior based on the obtained pairwise constraints. We introduce two information-theoretic criteria for selecting informative pairs. One selects the pair with the most uncertainty, and the other chooses the pair that maximizes the marginal information gain about the clustering. Experiments on benchmark datasets demonstrate the effectiveness of the proposed method over state-of-the-art methods.
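The uncertainty-based criterion described in the abstract can be illustrated with a minimal sketch. This is not the paper's model or inference procedure: it assumes the posterior over clusterings is represented by a hypothetical array `samples` of sampled cluster assignments, estimates each pair's must-link probability as the fraction of samples that place both instances in the same cluster, and picks the unqueried pair whose binary entropy is largest. The function names `pair_entropy` and `most_uncertain_pair` are illustrative only.

```python
import numpy as np


def pair_entropy(p):
    """Binary entropy (in nats) of a must-link probability p."""
    # Clip to avoid log(0) for pairs that always (or never) co-cluster.
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def most_uncertain_pair(samples, queried=frozenset()):
    """Return the unqueried pair (i, j), i < j, with the highest entropy.

    samples: (S, N) integer array; row s is one sampled assignment of
             N instances to clusters (an assumed stand-in for the posterior).
    queried: set of (i, j) pairs, i < j, that were already sent to the oracle.
    """
    samples = np.asarray(samples)
    n = samples.shape[1]
    # Monte Carlo estimate of P(z_i == z_j) for every pair, shape (N, N).
    same = (samples[:, :, None] == samples[:, None, :]).mean(axis=0)
    entropy = pair_entropy(same)

    best_pair, best_entropy = None, -1.0
    for i in range(n):
        for j in range(i + 1, n):
            if (i, j) in queried:
                continue
            if entropy[i, j] > best_entropy:
                best_pair, best_entropy = (i, j), float(entropy[i, j])
    return best_pair, best_entropy
```

In an active loop under these assumptions, the selected pair would be sent to the oracle, the returned must-link or cannot-link constraint would be used to update the posterior, and the pair would be added to `queried` before the next selection. The information-gain criterion mentioned in the abstract would need the model's full posterior and is not sketched here.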
Cite
Text
Pei et al. "Bayesian Active Clustering with Pairwise Constraints." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2015. doi:10.1007/978-3-319-23528-8_15
Markdown
[Pei et al. "Bayesian Active Clustering with Pairwise Constraints." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2015.](https://mlanthology.org/ecmlpkdd/2015/pei2015ecmlpkdd-bayesian/) doi:10.1007/978-3-319-23528-8_15
BibTeX
@inproceedings{pei2015ecmlpkdd-bayesian,
title = {{Bayesian Active Clustering with Pairwise Constraints}},
author = {Pei, Yuanli and Liu, Li-Ping and Fern, Xiaoli Z.},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2015},
pages = {235-250},
doi = {10.1007/978-3-319-23528-8_15},
url = {https://mlanthology.org/ecmlpkdd/2015/pei2015ecmlpkdd-bayesian/}
}