Comparing Clusterings in Space

Abstract

This paper proposes a new method for comparing clusterings both partitionally and geometrically. Our approach is motivated by the following observation: the vast majority of previous techniques for comparing clusterings are entirely partitional, i.e., they examine the assignments of points in set-theoretic terms after the points have been partitioned. In doing so, these methods ignore the spatial layout of the data, disregarding the fact that this information is responsible for generating the clusterings to begin with. We demonstrate that this leads to a variety of failure modes; in particular, previous comparison techniques often fail to differentiate between significant changes made to the data being clustered. We formulate a new measure for comparing clusterings that uses optimization theory to combine spatial and partitional information. Doing so eliminates pathological conditions in previous approaches. It also removes common limitations, such as the requirements that the clusterings have the same number of clusters or that they are over identical datasets. This approach is stable, easily implemented, and has strong intuitive appeal.
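To illustrate what "entirely partitional" means, the following minimal sketch computes the Rand index, a standard pair-counting comparison measure. It is not the measure proposed in this paper. The label lists and the two one-dimensional layouts (`near_layout`, `far_layout`) are hypothetical values chosen for illustration. Because the function receives only labels, it necessarily returns the same score whether the reassigned point lies on the boundary between the clusters or deep inside its original cluster.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree.

    A pair agrees if both labelings put the two points in the same
    cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same points")
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical example: six points, two clusters.
reference = [0, 0, 0, 1, 1, 1]
# Same points, but the third point is reassigned to cluster 1.
alternative = [0, 0, 1, 1, 1, 1]

# Two hypothetical layouts for the same six points. In near_layout the
# reassigned point (index 2) sits next to cluster 1; in far_layout it
# sits deep inside cluster 0.
near_layout = [0.0, 0.1, 0.9, 1.0, 1.1, 1.2]
far_layout = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]

# The coordinates never enter the computation, so one score is
# reported for both layouts.
score = rand_index(reference, alternative)
for name in ("near_layout", "far_layout"):
    print(f"{name}: Rand index = {score:.4f}")
```

A measure that also took the coordinates into account could treat the reassignment in `far_layout` as a larger disagreement than the one in `near_layout`; a purely partitional measure cannot draw that distinction.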

Cite

Text

Coen et al. "Comparing Clusterings in Space." International Conference on Machine Learning, 2010.

Markdown

[Coen et al. "Comparing Clusterings in Space." International Conference on Machine Learning, 2010.](https://mlanthology.org/icml/2010/coen2010icml-comparing/)

BibTeX

@inproceedings{coen2010icml-comparing,
  title     = {{Comparing Clusterings in Space}},
  author    = {Coen, Michael H. and Ansari, M. Hidayath and Fillmore, Nathanael},
  booktitle = {International Conference on Machine Learning},
  year      = {2010},
  pages     = {231-238},
  url       = {https://mlanthology.org/icml/2010/coen2010icml-comparing/}
}