On Finding Large Conjunctive Clusters
Abstract
We propose a new formulation of the clustering problem that differs from previous work in several aspects. First, the goal is to explicitly output a collection of simple and meaningful conjunctive descriptions of the clusters. Second, the clusters might overlap, i.e., a point can belong to multiple clusters. Third, the clusters might not cover all points, i.e., not every point is clustered. Finally, we allow a point to be assigned to a conjunctive cluster description even if it does not completely satisfy all of the attributes, but rather only satisfies most. A convenient way to view our clustering problem is that of finding a collection of large bicliques in a bipartite graph. Identifying one largest conjunctive cluster is equivalent to finding a maximum edge biclique. Since this problem is NP-hard [28] and there is evidence that it is difficult to approximate [12], we solve a relaxed version where the objective is to find a large subgraph that is close to being a biclique. We give a randomized algorithm that finds a relaxed biclique with almost as many edges as the maximum biclique. We then extend this algorithm to identify a good collection of large relaxed bicliques. A key property of these algorithms is that their running time is independent of the number of data points and linear in the number of attributes.
Cite
Text
Mishra et al. "On Finding Large Conjunctive Clusters." Annual Conference on Computational Learning Theory, 2003. doi:10.1007/978-3-540-45167-9_33Markdown
[Mishra et al. "On Finding Large Conjunctive Clusters." Annual Conference on Computational Learning Theory, 2003.](https://mlanthology.org/colt/2003/mishra2003colt-finding/) doi:10.1007/978-3-540-45167-9_33BibTeX
@inproceedings{mishra2003colt-finding,
title = {{On Finding Large Conjunctive Clusters}},
author = {Mishra, Nina and Ron, Dana and Swaminathan, Ram},
booktitle = {Annual Conference on Computational Learning Theory},
year = {2003},
pages = {448-462},
doi = {10.1007/978-3-540-45167-9_33},
url = {https://mlanthology.org/colt/2003/mishra2003colt-finding/}
}