Cluster Identification in Nearest-Neighbor Graphs

Abstract

Assume we are given a sample of points from some underlying distribution which contains several distinct clusters. Our goal is to construct a neighborhood graph on the sample points such that clusters are “identified”: that is, the subgraph induced by points from the same cluster is connected, while subgraphs corresponding to different clusters are not connected to each other. We derive bounds on the probability that cluster identification is successful, and use them to predict “optimal” values of k for the mutual and symmetric k-nearest-neighbor graphs. We point out different properties of the mutual and symmetric nearest-neighbor graphs related to the cluster identification problem.
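
As an illustration of the definitions in the abstract, the following is a minimal sketch of the two graph constructions and of the "identified" check. It is not the authors' code. It assumes the usual conventions: the symmetric k-nearest-neighbor graph joins two points if either is among the other's k nearest neighbors, and the mutual graph joins them only if both are. The function names (`knn_graphs`, `clusters_identified`), the toy data, and the use of known cluster labels are all hypothetical choices made for this example.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbors of each point (brute force, Euclidean)."""
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    return np.argsort(dist, axis=1, kind="stable")[:, :k]

def knn_graphs(points, k):
    """Return boolean adjacency matrices (symmetric kNN graph, mutual kNN graph)."""
    n = len(points)
    directed = np.zeros((n, n), dtype=bool)
    rows = np.repeat(np.arange(n), k)
    directed[rows, knn_indices(points, k).ravel()] = True
    symmetric = directed | directed.T  # edge if either point picks the other
    mutual = directed & directed.T     # edge only if both points pick each other
    return symmetric, mutual

def is_connected(adj):
    """Depth-first search over a boolean adjacency matrix."""
    seen = np.zeros(adj.shape[0], dtype=bool)
    seen[0] = True
    stack = [0]
    while stack:
        u = stack.pop()
        for v in np.flatnonzero(adj[u] & ~seen):
            seen[v] = True
            stack.append(v)
    return bool(seen.all())

def clusters_identified(adj, labels):
    """True if every cluster induces a connected subgraph and no edge joins two clusters."""
    for c in np.unique(labels):
        members = np.flatnonzero(labels == c)
        if not is_connected(adj[np.ix_(members, members)]):
            return False
    different = labels[:, None] != labels[None, :]
    return not bool((adj & different).any())

# Toy data (an assumption for illustration): two well-separated clusters
# of unequal size, placed on a line.
cluster_a = np.array([[float(x), 0.0] for x in range(5)])
cluster_b = np.array([[100.0 + x, 0.0] for x in range(7)])
points = np.vstack([cluster_a, cluster_b])
labels = np.array([0] * 5 + [1] * 7)

for k in (2, 5):
    symmetric, mutual = knn_graphs(points, k)
    print(k, clusters_identified(symmetric, labels), clusters_identified(mutual, labels))
```

On this toy data both graphs identify the clusters for k = 2. For k = 5 the smaller cluster has only four in-cluster neighbors per point, so each of its points picks a fifth neighbor in the other cluster: the symmetric graph then links the clusters, while the mutual graph does not, because no point of the larger cluster reciprocates. This only illustrates how the choice of k and of graph type can matter; it says nothing about the probability bounds derived in the paper.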

Cite

Text

Maier et al. "Cluster Identification in Nearest-Neighbor Graphs." International Conference on Algorithmic Learning Theory, 2007. doi:10.1007/978-3-540-75225-7_18

Markdown

[Maier et al. "Cluster Identification in Nearest-Neighbor Graphs." International Conference on Algorithmic Learning Theory, 2007.](https://mlanthology.org/alt/2007/maier2007alt-cluster/) doi:10.1007/978-3-540-75225-7_18

BibTeX

@inproceedings{maier2007alt-cluster,
  title     = {{Cluster Identification in Nearest-Neighbor Graphs}},
  author    = {Maier, Markus and Hein, Matthias and von Luxburg, Ulrike},
  booktitle = {International Conference on Algorithmic Learning Theory},
  year      = {2007},
  pages     = {196--210},
  doi       = {10.1007/978-3-540-75225-7_18},
  url       = {https://mlanthology.org/alt/2007/maier2007alt-cluster/}
}