Discrete Model-Based Clustering with Overlapping Subsets of Attributes
Abstract
Traditional model-based clustering methods assume that data instances can be grouped in a single “best” way. This is often untrue for complex data, where several meaningful sets of clusters may exist, each associated with a unique subset of data attributes. Current literature has approached this problem with models that use disjoint subsets of attributes to define distinct clustering solutions, each solution being represented by a cluster variable. However, restricting each attribute to a single cluster variable diminishes the expressiveness and quality of these models. For this reason, we propose a novel kind of model that allows cluster variables to share overlapping subsets of attributes. To learn these models, we propose combining a search-based method with an attribute clustering procedure. Experimental results with both synthetic and real-world data show the utility of our approach and its competitiveness with the state of the art.
Cite
Text
Rodriguez-Sanchez et al. "Discrete Model-Based Clustering with Overlapping Subsets of Attributes." Proceedings of the Ninth International Conference on Probabilistic Graphical Models, 2018.
Markdown
[Rodriguez-Sanchez et al. "Discrete Model-Based Clustering with Overlapping Subsets of Attributes." Proceedings of the Ninth International Conference on Probabilistic Graphical Models, 2018.](https://mlanthology.org/pgm/2018/rodriguezsanchez2018pgm-discrete/)
BibTeX
@inproceedings{rodriguezsanchez2018pgm-discrete,
title = {{Discrete Model-Based Clustering with Overlapping Subsets of Attributes}},
author = {Rodriguez-Sanchez, Fernando and Larrañaga, Pedro and Bielza, Concha},
booktitle = {Proceedings of the Ninth International Conference on Probabilistic Graphical Models},
year = {2018},
pages = {392-403},
volume = {72},
url = {https://mlanthology.org/pgm/2018/rodriguezsanchez2018pgm-discrete/}
}