Proximity Graphs for Clustering and Manifold Learning

Abstract

Many machine learning algorithms for clustering or dimensionality re- duction take as input a cloud of points in Euclidean space, and construct a graph with the input data points as vertices. This graph is then parti- tioned (clustering) or used to redefine metric information (dimensional- ity reduction). There has been much recent work on new methods for graph-based clustering and dimensionality reduction, but not much on constructing the graph itself. Graphs typically used include the fully- connected graph, a local fixed-grid graph (for image segmentation) or a nearest-neighbor graph. We suggest that the graph should adapt locally to the structure of the data. This can be achieved by a graph ensemble that combines multiple minimum spanning trees, each fit to a perturbed version of the data set. We show that such a graph ensemble usually pro- duces a better representation of the data manifold than standard methods; and that it provides robustness to a subsequent clustering or dimension- ality reduction algorithm based on the graph.

Cite

Text

Zemel and Carreira-Perpiñán. "Proximity Graphs for Clustering and Manifold Learning." Neural Information Processing Systems, 2004.

Markdown

[Zemel and Carreira-Perpiñán. "Proximity Graphs for Clustering and Manifold Learning." Neural Information Processing Systems, 2004.](https://mlanthology.org/neurips/2004/zemel2004neurips-proximity/)

BibTeX

@inproceedings{zemel2004neurips-proximity,
  title     = {{Proximity Graphs for Clustering and Manifold Learning}},
  author    = {Zemel, Richard S. and Carreira-Perpiñán, Miguel Á.},
  booktitle = {Neural Information Processing Systems},
  year      = {2004},
  pages     = {225-232},
  url       = {https://mlanthology.org/neurips/2004/zemel2004neurips-proximity/}
}