StreamKrimp: Detecting Change in Data Streams
Abstract
Data streams are ubiquitous. Examples range from sensor networks to financial transactions and website logs. In fact, even market basket data can be seen as a stream of sales. Detecting changes in the distribution a stream is sampled from is one of the most challenging problems in stream mining, as only limited storage can be used. In this paper we analyse this problem for streams of transaction data from an MDL perspective. Based on this analysis we introduce the StreamKrimp algorithm, which uses the Krimp algorithm to characterise probability distributions with code tables. With these code tables, StreamKrimp partitions the stream into a sequence of substreams. Each switch of code table indicates a change in the underlying distribution. Experiments on both real and artificial streams show that StreamKrimp detects the changes while using only a very limited amount of data storage.
Cite
Text
van Leeuwen and Siebes. "StreamKrimp: Detecting Change in Data Streams." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2008. doi:10.1007/978-3-540-87479-9_62
Markdown
[van Leeuwen and Siebes. "StreamKrimp: Detecting Change in Data Streams." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2008.](https://mlanthology.org/ecmlpkdd/2008/vanleeuwen2008ecmlpkdd-streamkrimp/) doi:10.1007/978-3-540-87479-9_62
BibTeX
@inproceedings{vanleeuwen2008ecmlpkdd-streamkrimp,
title = {{StreamKrimp: Detecting Change in Data Streams}},
author = {van Leeuwen, Matthijs and Siebes, Arno},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2008},
pages = {672-687},
doi = {10.1007/978-3-540-87479-9_62},
url = {https://mlanthology.org/ecmlpkdd/2008/vanleeuwen2008ecmlpkdd-streamkrimp/}
}