A Clustering Model Based on Matrix Approximation with Applications to Cluster System Log Files
Abstract
In system management applications, to perform automated analysis of the historical data across multiple components when problems occur, we need to cluster the log messages with disparate formats to automatically infer the common set of semantic situations and obtain a brief description for each situation. In this paper, we propose a clustering model where the problem of clustering is formulated as matrix approximation and the clustering objective is to minimize the approximation error between the original data matrix and the matrix reconstructed from the cluster structures. The model explicitly characterizes the data and feature memberships and thus enables the description of each cluster. We present a two-side spectral relaxation optimization procedure for the clustering model. We also establish the connections between our clustering model and existing approaches. Experimental results show the effectiveness of the proposed approach.
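The objective described in the abstract can be illustrated with a small sketch. The abstract does not give the exact factorization, so the following is assumed for illustration only: a data matrix `W` (rows as log messages, columns as terms), hard membership indicator matrices `A` (data) and `B` (features), and a block-mean matrix `X`, with the reconstruction taken as `A @ X @ B.T`. The function names and the toy matrix are hypothetical, and the paper's spectral relaxation procedure is not reproduced here.

```python
import numpy as np

def reconstruct(W, row_labels, col_labels, k):
    """Rebuild W from hard data/feature memberships (assumed form A X B^T)."""
    A = np.eye(k)[np.asarray(row_labels)]  # n x k data membership indicators
    B = np.eye(k)[np.asarray(col_labels)]  # m x k feature membership indicators
    # Number of entries in each (data cluster, feature cluster) block.
    counts = np.outer(A.sum(axis=0), B.sum(axis=0))
    # Block means: the least-squares choice of X for fixed hard memberships.
    X = (A.T @ W @ B) / np.maximum(counts, 1)
    return A @ X @ B.T, X

def approximation_error(W, row_labels, col_labels, k):
    """Squared Frobenius error between W and its reconstruction."""
    W_hat, _ = reconstruct(W, row_labels, col_labels, k)
    return float(np.sum((W - W_hat) ** 2))

# Toy binary message-by-term matrix with two clean blocks (hypothetical data).
W = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

good = approximation_error(W, [0, 0, 1, 1], [0, 0, 1, 1], k=2)
bad = approximation_error(W, [0, 1, 0, 1], [0, 0, 1, 1], k=2)
print(good, bad)
```

Under these assumptions, memberships that match the block structure reconstruct `W` with a lower error than mismatched ones, and the columns assigned to each feature cluster serve as a rough description of the corresponding data cluster.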
Cite
Text
Li and Peng. "A Clustering Model Based on Matrix Approximation with Applications to Cluster System Log Files." European Conference on Machine Learning, 2005. doi:10.1007/11564096_62
Markdown
[Li and Peng. "A Clustering Model Based on Matrix Approximation with Applications to Cluster System Log Files." European Conference on Machine Learning, 2005.](https://mlanthology.org/ecmlpkdd/2005/li2005ecml-clustering/) doi:10.1007/11564096_62
BibTeX
@inproceedings{li2005ecml-clustering,
title = {{A Clustering Model Based on Matrix Approximation with Applications to Cluster System Log Files}},
author = {Li, Tao and Peng, Wei},
booktitle = {European Conference on Machine Learning},
year = {2005},
pages = {625-632},
doi = {10.1007/11564096_62},
url = {https://mlanthology.org/ecmlpkdd/2005/li2005ecml-clustering/}
}