A Clustering Model Based on Matrix Approximation with Applications to Cluster System Log Files

Abstract

In system management applications, automated analysis of historical data across multiple components when problems occur requires clustering log messages with disparate formats, so that the common set of semantic situations can be inferred automatically and a brief description obtained for each situation. In this paper, we propose a clustering model in which clustering is formulated as matrix approximation: the objective is to minimize the approximation error between the original data matrix and the matrix reconstructed from the cluster structures. The model explicitly characterizes the data and feature memberships and thus enables a description of each cluster. We present a two-side spectral relaxation optimization procedure for the clustering model. We also establish connections between our clustering model and existing approaches. Experimental results show the effectiveness of the proposed approach.
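
The objective described in the abstract can be illustrated with a small sketch. It is not the paper's optimization procedure (the spectral relaxation is not shown), and the block-mean reconstruction is an assumption chosen for simplicity. The names `block_means` and `approximation_error`, the toy matrix `X`, and the label lists are all hypothetical. Rows stand for log messages and columns for terms. Each row and each column is assigned to one of `k` clusters, the matrix is reconstructed from the mean of each (row cluster, column cluster) block, and the squared error against the original matrix is reported.

```python
def block_means(X, row_labels, col_labels, k):
    """Mean of X over each (row cluster, column cluster) block.

    Assumption: the reconstruction uses block means; this is an
    illustrative choice, not taken from the paper.
    """
    sums = [[0.0] * k for _ in range(k)]
    counts = [[0] * k for _ in range(k)]
    for i, row in enumerate(X):
        for j, value in enumerate(row):
            p, q = row_labels[i], col_labels[j]
            sums[p][q] += value
            counts[p][q] += 1
    # Empty blocks get a mean of 0.0 to avoid division by zero.
    return [
        [sums[p][q] / counts[p][q] if counts[p][q] else 0.0 for q in range(k)]
        for p in range(k)
    ]


def approximation_error(X, row_labels, col_labels, k):
    """Squared error between X and its block-mean reconstruction."""
    C = block_means(X, row_labels, col_labels, k)
    return sum(
        (value - C[row_labels[i]][col_labels[j]]) ** 2
        for i, row in enumerate(X)
        for j, value in enumerate(row)
    )


# Hypothetical binary message-by-term matrix (1 = term occurs in message).
X = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]

# Memberships that follow the visible block structure.
good = approximation_error(X, [0, 0, 1, 1], [0, 0, 1, 1], k=2)
# Memberships that cut across the block structure.
bad = approximation_error(X, [0, 1, 0, 1], [0, 1, 0, 1], k=2)

print(good, bad)
```

In this toy case the memberships that match the block structure give the smaller error, which is the sense in which a clustering can be scored by how well it lets the data matrix be reconstructed. The feature (column) memberships attached to each data cluster are also what would supply a short description of that cluster.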

Cite

Text

Li and Peng. "A Clustering Model Based on Matrix Approximation with Applications to Cluster System Log Files." European Conference on Machine Learning, 2005. doi:10.1007/11564096_62

Markdown

[Li and Peng. "A Clustering Model Based on Matrix Approximation with Applications to Cluster System Log Files." European Conference on Machine Learning, 2005.](https://mlanthology.org/ecmlpkdd/2005/li2005ecml-clustering/) doi:10.1007/11564096_62

BibTeX

@inproceedings{li2005ecml-clustering,
  title     = {{A Clustering Model Based on Matrix Approximation with Applications to Cluster System Log Files}},
  author    = {Li, Tao and Peng, Wei},
  booktitle = {European Conference on Machine Learning},
  year      = {2005},
  pages     = {625-632},
  doi       = {10.1007/11564096_62},
  url       = {https://mlanthology.org/ecmlpkdd/2005/li2005ecml-clustering/}
}