A Lossless Data Reduction for Mining Constrained Patterns in N-Ary Relations
Abstract
Given a binary relation, listing the itemsets takes exponential time. The problem grows worse when searching for analogous patterns defined in n-ary relations. However, real-life relations are sparse and, with a greater number n of dimensions, they tend to be even sparser. Moreover, not all itemsets are searched, only those satisfying some user-defined constraints, such as minimal size constraints. This article proposes to exploit together the sparsity of the relation and the presence of constraints satisfying a common property, monotonicity w.r.t. one dimension. It details a pre-processing step that identifies and erases n-tuples whose removal does not change the collection of patterns to be discovered. That reduction of the relation is achieved in time and space linear in the number of n-tuples. Experiments on two real-life datasets show that, whatever the algorithm used afterward to actually list the patterns, the pre-processing lowers the overall running time by a factor typically ranging from 10 to 100.
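To make the idea of a lossless reduction under minimal size constraints concrete, here is a minimal Python sketch. It is not the algorithm of the paper: it is a simplified illustration that assumes a pattern is a sub-hyperrectangle X1 × ... × Xn fully contained in the relation, with at least `min_sizes[i]` elements in dimension i. Under that assumption, any element of dimension i that belongs to a pattern occurs in at least the product of the other dimensions' minimal sizes many n-tuples, so n-tuples involving rarer elements can be erased safely. The names `reduce_relation` and `min_sizes` are hypothetical, and this naive fixed-point loop makes no claim to the linear time and space stated in the abstract.

```python
from collections import Counter
from math import prod


def reduce_relation(tuples, min_sizes):
    """Erase n-tuples that cannot belong to any pattern meeting the size constraints.

    tuples    -- iterable of n-tuples (the relation)
    min_sizes -- minimal number of elements a pattern must have in each dimension
    """
    n = len(min_sizes)
    # An element of dimension i inside a pattern appears in at least
    # prod(min_sizes[j] for j != i) n-tuples of the relation.
    needed = [prod(m for j, m in enumerate(min_sizes) if j != i) for i in range(n)]
    remaining = set(tuples)
    while True:
        # Count, per dimension, how many remaining n-tuples each element occurs in.
        counts = [Counter(t[i] for t in remaining) for i in range(n)]
        kept = {
            t for t in remaining
            if all(counts[i][t[i]] >= needed[i] for i in range(n))
        }
        if kept == remaining:  # fixed point: nothing more can be erased
            return kept
        remaining = kept
```

For example, on a ternary relation made of a dense 2 × 2 × 2 block plus a few isolated 3-tuples, calling `reduce_relation(relation, (2, 2, 2))` keeps only the block, since every element outside it occurs in fewer than 4 tuples.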
Cite
Text
Poesia and Cerf. "A Lossless Data Reduction for Mining Constrained Patterns in N-Ary Relations." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2014. doi:10.1007/978-3-662-44851-9_37
Markdown
[Poesia and Cerf. "A Lossless Data Reduction for Mining Constrained Patterns in N-Ary Relations." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2014.](https://mlanthology.org/ecmlpkdd/2014/poesia2014ecmlpkdd-lossless/) doi:10.1007/978-3-662-44851-9_37
BibTeX
@inproceedings{poesia2014ecmlpkdd-lossless,
title = {{A Lossless Data Reduction for Mining Constrained Patterns in N-Ary Relations}},
author = {Poesia, Gabriel and Cerf, Loïc},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2014},
pages = {581-596},
doi = {10.1007/978-3-662-44851-9_37},
url = {https://mlanthology.org/ecmlpkdd/2014/poesia2014ecmlpkdd-lossless/}
}