A Split-Merge DP-Means Algorithm to Avoid Local Minima
Abstract
We present an extension of the DP-means algorithm, a hard-clustering approximation of nonparametric Bayesian models. Although a recent work [ 6 ] reports that the DP-means can converge to a local minimum, the condition for the DP-means to converge to a local minimum is still unknown. This paper demonstrates one reason the DP-means converges to a local minimum: the DP-means cannot assign the optimal number of clusters when many data points exist within small distances . As a first attempt to avoid the local minimum, we propose an extension of the DP-means by the split-merge technique. The proposed algorithm splits clusters when a cluster has many data points to assign the number of clusters near to optimal. The experimental results with multiple datasets show the robustness of the proposed algorithm.
Cite
Text
Odashima et al. "A Split-Merge DP-Means Algorithm to Avoid Local Minima." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016. doi:10.1007/978-3-319-46227-1_5Markdown
[Odashima et al. "A Split-Merge DP-Means Algorithm to Avoid Local Minima." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016.](https://mlanthology.org/ecmlpkdd/2016/odashima2016ecmlpkdd-splitmerge/) doi:10.1007/978-3-319-46227-1_5BibTeX
@inproceedings{odashima2016ecmlpkdd-splitmerge,
title = {{A Split-Merge DP-Means Algorithm to Avoid Local Minima}},
author = {Odashima, Shigeyuki and Ueki, Miwa and Sawasaki, Naoyuki},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2016},
pages = {63-78},
doi = {10.1007/978-3-319-46227-1_5},
url = {https://mlanthology.org/ecmlpkdd/2016/odashima2016ecmlpkdd-splitmerge/}
}