Proximity Graphs for Clustering and Manifold Learning
Abstract
Many machine learning algorithms for clustering or dimensionality re- duction take as input a cloud of points in Euclidean space, and construct a graph with the input data points as vertices. This graph is then parti- tioned (clustering) or used to redefine metric information (dimensional- ity reduction). There has been much recent work on new methods for graph-based clustering and dimensionality reduction, but not much on constructing the graph itself. Graphs typically used include the fully- connected graph, a local fixed-grid graph (for image segmentation) or a nearest-neighbor graph. We suggest that the graph should adapt locally to the structure of the data. This can be achieved by a graph ensemble that combines multiple minimum spanning trees, each fit to a perturbed version of the data set. We show that such a graph ensemble usually pro- duces a better representation of the data manifold than standard methods; and that it provides robustness to a subsequent clustering or dimension- ality reduction algorithm based on the graph.
Cite
Text
Zemel and Carreira-Perpiñán. "Proximity Graphs for Clustering and Manifold Learning." Neural Information Processing Systems, 2004.Markdown
[Zemel and Carreira-Perpiñán. "Proximity Graphs for Clustering and Manifold Learning." Neural Information Processing Systems, 2004.](https://mlanthology.org/neurips/2004/zemel2004neurips-proximity/)BibTeX
@inproceedings{zemel2004neurips-proximity,
title = {{Proximity Graphs for Clustering and Manifold Learning}},
author = {Zemel, Richard S. and Carreira-Perpiñán, Miguel Á.},
booktitle = {Neural Information Processing Systems},
year = {2004},
pages = {225-232},
url = {https://mlanthology.org/neurips/2004/zemel2004neurips-proximity/}
}