A Scalable Framework for Discovering Coherent Co-Clusters in Noisy Data
Abstract
Clustering problems often involve datasets where only a part of the data is relevant to the problem, e.g., in microarray data analysis only a subset of the genes show cohesive expressions within a subset of the conditions/features. The existence of a large number of non-informative data points and features makes it challenging to hunt for coherent and meaningful clusters from such datasets. Additionally, since clusters could exist in different subspaces of the feature space, a co-clustering algorithm that simultaneously clusters objects and features is often more suitable as compared to one that is restricted to traditional "one-sided" clustering. We propose Robust Overlapping Co-clustering (ROCC), a scalable and very versatile framework that addresses the problem of efficiently mining dense, arbitrarily positioned, possibly overlapping co-clusters from large, noisy datasets. ROCC has several desirable properties that make it extremely well suited to a number of real life applications.
Cite
Text
Deodhar et al. "A Scalable Framework for Discovering Coherent Co-Clusters in Noisy Data." International Conference on Machine Learning, 2009. doi:10.1145/1553374.1553405
Markdown
[Deodhar et al. "A Scalable Framework for Discovering Coherent Co-Clusters in Noisy Data." International Conference on Machine Learning, 2009.](https://mlanthology.org/icml/2009/deodhar2009icml-scalable/) doi:10.1145/1553374.1553405
BibTeX
@inproceedings{deodhar2009icml-scalable,
title = {{A Scalable Framework for Discovering Coherent Co-Clusters in Noisy Data}},
author = {Deodhar, Meghana and Gupta, Gunjan and Ghosh, Joydeep and Cho, Hyuk and Dhillon, Inderjit S.},
booktitle = {International Conference on Machine Learning},
year = {2009},
pages = {241-248},
doi = {10.1145/1553374.1553405},
url = {https://mlanthology.org/icml/2009/deodhar2009icml-scalable/}
}