Clustering via Mode Seeking by Direct Estimation of the Gradient of a Log-Density
Abstract
Mean shift clustering finds the modes of the data probability density by identifying the zero points of the density gradient. Because it does not require fixing the number of clusters in advance, mean shift has been a popular clustering algorithm in various application fields. A typical implementation of mean shift first estimates the density by kernel density estimation and then computes its gradient. However, since a good density estimate does not necessarily imply an accurate estimate of the density gradient, such an indirect two-step approach is not reliable. In this paper, we propose a method to directly estimate the gradient of the log-density without going through density estimation. The proposed method gives the global solution analytically and is thus computationally efficient. We then develop a mean-shift-like fixed-point algorithm to find the modes of the density for clustering. As in mean shift, one does not need to set the number of clusters in advance. We experimentally show that the proposed clustering method significantly outperforms mean shift, especially for high-dimensional data.
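For orientation, the baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of standard mean shift with a Gaussian kernel, where each point is repeatedly replaced by the kernel-weighted mean of the data (the fixed point of a zero kernel-density gradient). It is not the direct log-density-gradient estimator proposed in the paper. The function names, `bandwidth`, `merge_radius`, and the toy data are assumptions made for this example only.

```python
import numpy as np


def mean_shift_modes(X, bandwidth=1.0, n_iter=200, tol=1e-6):
    """Move every point uphill on a Gaussian KDE until it stops moving."""
    X = np.asarray(X, dtype=float)
    Z = X.copy()  # current positions; the data X stay fixed
    for _ in range(n_iter):
        # Squared distances between current positions and all data points.
        d2 = ((Z[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
        W = np.exp(-d2 / (2.0 * bandwidth ** 2))  # Gaussian kernel weights
        Z_new = (W @ X) / W.sum(axis=1, keepdims=True)  # weighted mean
        shift = np.abs(Z_new - Z).max()
        Z = Z_new
        if shift < tol:
            break
    return Z


def label_by_mode(Z, merge_radius):
    """Group points whose converged positions lie within merge_radius."""
    modes, labels = [], np.empty(len(Z), dtype=int)
    for i, z in enumerate(Z):
        for k, m in enumerate(modes):
            if np.linalg.norm(z - m) < merge_radius:
                labels[i] = k
                break
        else:
            # No existing mode is close enough: start a new cluster.
            modes.append(z)
            labels[i] = len(modes) - 1
    return np.array(modes), labels


# Toy data (hypothetical): two well-separated 2-D blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3.0, 0.5, size=(50, 2)),
               rng.normal(3.0, 0.5, size=(50, 2))])
Z = mean_shift_modes(X, bandwidth=1.0)
modes, labels = label_by_mode(Z, merge_radius=0.5)
```

The number of clusters is never passed in; it emerges from how many distinct modes the points converge to, which is the property the abstract highlights. The result does depend on the bandwidth, which this sketch fixes by hand.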
Cite
Text
Sasaki et al. "Clustering via Mode Seeking by Direct Estimation of the Gradient of a Log-Density." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2014. doi:10.1007/978-3-662-44845-8_2
Markdown
[Sasaki et al. "Clustering via Mode Seeking by Direct Estimation of the Gradient of a Log-Density." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2014.](https://mlanthology.org/ecmlpkdd/2014/sasaki2014ecmlpkdd-clustering/) doi:10.1007/978-3-662-44845-8_2
BibTeX
@inproceedings{sasaki2014ecmlpkdd-clustering,
title = {{Clustering via Mode Seeking by Direct Estimation of the Gradient of a Log-Density}},
author = {Sasaki, Hiroaki and Hyvärinen, Aapo and Sugiyama, Masashi},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2014},
pages = {19--34},
doi = {10.1007/978-3-662-44845-8_2},
url = {https://mlanthology.org/ecmlpkdd/2014/sasaki2014ecmlpkdd-clustering/}
}