A Lossless Data Reduction for Mining Constrained Patterns in N-Ary Relations

Abstract

Given a binary relation, listing the itemsets takes exponential time. The problem grows worse when searching for analogous patterns defined in n-ary relations. However, real-life relations are sparse and, with a greater number n of dimensions, they tend to be even sparser. Moreover, not all itemsets are searched for, only those satisfying some user-defined constraints, such as minimal size constraints. This article proposes to jointly exploit the sparsity of the relation and the presence of constraints satisfying a common property, monotonicity w.r.t. one dimension. It details a pre-processing step that identifies and erases n-tuples whose removal does not change the collection of patterns to be discovered. That reduction of the relation is achieved in time and space that are linear in the number of n-tuples. Experiments on two real-life datasets show that, whatever the algorithm used afterward to actually list the patterns, the pre-processing lowers the overall running time by a factor typically ranging from 10 to 100.
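
To illustrate the kind of reduction the abstract describes, here is a minimal Python sketch restricted to minimal size constraints. It is an assumption-laden illustration, not the authors' algorithm: the function name `reduce_relation`, the counting argument, and the toy data are all hypothetical. It assumes a pattern is a tuple of sets (X_1, ..., X_n) whose Cartesian product is contained in the relation, so an element of dimension i that belongs to a pattern with |X_j| >= m_j for every j must occur in at least the product of the m_j (j != i) tuples. Elements below that count, and the n-tuples holding them, can be erased repeatedly without losing any such pattern.

```python
from collections import defaultdict
from itertools import product
from math import prod


def reduce_relation(tuples, min_sizes):
    """Erase n-tuples that cannot belong to any pattern meeting min_sizes.

    Hypothetical sketch: only minimal size constraints are handled.
    """
    n = len(min_sizes)
    # Minimum number of tuples an element of dimension i must occur in.
    need = [prod(m for j, m in enumerate(min_sizes) if j != i) for i in range(n)]

    alive = set(tuples)
    index = defaultdict(set)  # (dimension, element) -> live tuples holding it
    for t in alive:
        for i, e in enumerate(t):
            index[(i, e)].add(t)

    # Elements that already occur in too few tuples.
    stack = [key for key, ts in index.items() if len(ts) < need[key[0]]]
    while stack:
        key = stack.pop()
        for t in list(index[key]):
            alive.discard(t)
            # Removing t may push its other elements below their threshold.
            for i, e in enumerate(t):
                holders = index[(i, e)]
                holders.discard(t)
                if holders and len(holders) < need[i]:
                    stack.append((i, e))
    return alive


# Toy 3-ary relation: a 2x2x2 block plus two isolated tuples.
cube = set(product(["a", "b"], ["x", "y"], [1, 2]))
relation = cube | {("c", "x", 1), ("a", "z", 3)}
reduced = reduce_relation(relation, min_sizes=(2, 2, 2))
unconstrained = reduce_relation(relation, min_sizes=(1, 1, 1))
```

With every minimal size set to 2, the two isolated tuples are erased and only the block remains. With every minimal size set to 1, nothing can be erased, which reflects how the reduction depends on the constraints.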

Cite

Text

Poesia and Cerf. "A Lossless Data Reduction for Mining Constrained Patterns in N-Ary Relations." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2014. doi:10.1007/978-3-662-44851-9_37

Markdown

[Poesia and Cerf. "A Lossless Data Reduction for Mining Constrained Patterns in N-Ary Relations." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2014.](https://mlanthology.org/ecmlpkdd/2014/poesia2014ecmlpkdd-lossless/) doi:10.1007/978-3-662-44851-9_37

BibTeX

@inproceedings{poesia2014ecmlpkdd-lossless,
  title     = {{A Lossless Data Reduction for Mining Constrained Patterns in N-Ary Relations}},
  author    = {Poesia, Gabriel and Cerf, Loïc},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2014},
  pages     = {581-596},
  doi       = {10.1007/978-3-662-44851-9_37},
  url       = {https://mlanthology.org/ecmlpkdd/2014/poesia2014ecmlpkdd-lossless/}
}