Efficient Discovery of Sets of Co-Occurring Items in Event Sequences

Abstract

Discovering patterns in long event sequences is an important data mining task. Most existing work focuses on frequency-based quality measures that allow algorithms to use the anti-monotonicity property to prune the search space and efficiently discover the most frequent patterns. In this work, we step away from such measures, and evaluate patterns using cohesion—a measure of how close to each other the items making up the pattern appear in the sequence on average. We tackle the fact that cohesion is not an anti-monotonic measure by developing a novel pruning technique in order to reduce the search space. By doing so, we are able to efficiently unearth rare, but strongly cohesive, patterns that existing methods often fail to discover. The data and software related to this paper are available at https://bitbucket.org/len_feremans/sequencepatternmining_public .
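To make the notion of cohesion concrete, the sketch below computes one possible version of it by brute force. The abstract does not give the formula, so the definition used here is an assumption: for every position where an item of the pattern occurs, take the length of the shortest window that covers that position and contains all items of the pattern, then divide the pattern size by the mean of those lengths. The names `min_window` and `cohesion` are hypothetical, and the code performs no pruning; it is not the authors' algorithm, only an illustration of the measure.

```python
from typing import Optional, Sequence, Set


def min_window(seq: Sequence[str], items: Set[str], t: int) -> Optional[int]:
    """Length of the shortest window [s, e] with s <= t <= e that contains
    every item in `items`, or None if no such window exists."""
    n = len(seq)
    best = None
    for s in range(t, -1, -1):
        # A window starting at s and covering t has length >= t - s + 1,
        # so earlier starts cannot beat the best length found so far.
        if best is not None and t - s + 1 >= best:
            break
        missing = items - set(seq[s:t + 1])
        e = t
        # Extend the window to the right until all items are covered.
        while missing and e + 1 < n:
            e += 1
            missing.discard(seq[e])
        if not missing:
            length = e - s + 1
            if best is None or length < best:
                best = length
    return best


def cohesion(seq: Sequence[str], items: Set[str]) -> float:
    """Assumed definition: |X| divided by the mean minimal window length
    over all occurrences of items of X; 0.0 if some item never occurs."""
    items = set(items)
    if not items or not items <= set(seq):
        return 0.0
    positions = [t for t, x in enumerate(seq) if x in items]
    windows = [min_window(seq, items, t) for t in positions]
    return len(items) * len(positions) / sum(windows)


# Items that always appear side by side are fully cohesive under this definition.
print(cohesion(list("abxxxxab"), {"a", "b"}))
# Items separated by two other events score lower.
print(cohesion(list("axxb"), {"a", "b"}))
```

Under this assumed definition, the pattern {a, b} occurs only four times in the first sequence yet scores the maximum value, which is the kind of rare but cohesive pattern the abstract refers to. The example also suggests why the measure is not anti-monotonic: adding an item changes both the pattern size and the window lengths, so a superset can score higher or lower than its subsets.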

Cite

Text

Cule et al. "Efficient Discovery of Sets of Co-Occurring Items in Event Sequences." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016. doi:10.1007/978-3-319-46128-1_23

Markdown

[Cule et al. "Efficient Discovery of Sets of Co-Occurring Items in Event Sequences." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016.](https://mlanthology.org/ecmlpkdd/2016/cule2016ecmlpkdd-efficient/) doi:10.1007/978-3-319-46128-1_23

BibTeX

@inproceedings{cule2016ecmlpkdd-efficient,
  title     = {{Efficient Discovery of Sets of Co-Occurring Items in Event Sequences}},
  author    = {Cule, Boris and Feremans, Len and Goethals, Bart},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2016},
  pages     = {361--377},
  doi       = {10.1007/978-3-319-46128-1_23},
  url       = {https://mlanthology.org/ecmlpkdd/2016/cule2016ecmlpkdd-efficient/}
}