Efficient Mining of Statistical Dependencies
Abstract
The Multi-Stream Dependency Detection algorithm finds rules that capture statistical dependencies between patterns in multivariate time series of categorical data (Oates & Cohen 1996c). Rule strength is measured by the G statistic (Wickens 1989), and an upper bound on the value of G for the descendants of a node allows msdd's search space to be pruned. However, in the worst case, the algorithm will explore exponentially many rules. This paper presents and empirically evaluates two ways of addressing this problem. The first is a set of three methods for reducing the size of msdd's search space based on information collected during the search process. Second, we discuss an implementation of msdd that distributes its computations over multiple machines on a network.
1 Introduction
Multi-Stream Dependency Detection (msdd) is an algorithm for finding rules that capture statistical dependencies in databases. Past applications of the algorithm include finding dependencies in multi-variate t...
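The abstract says rule strength is measured by the G statistic. As a rough illustration only, the sketch below computes the standard log-likelihood-ratio G for a contingency table of counts. The function name `g_statistic` and the table layout (rows for whether a rule's precursor matched, columns for whether its successor matched) are assumptions made for this example, not the authors' implementation, and the sketch does not cover the pruning bound or the distributed search.

```python
import math


def g_statistic(table):
    """Log-likelihood-ratio statistic G = 2 * sum(O * ln(O / E)).

    `table` is a list of rows of non-negative observed counts.
    The layout (precursor match vs. successor match) is an assumption
    for illustration; any contingency table of counts works.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0:
        raise ValueError("contingency table has no observations")

    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed == 0:
                # A zero cell contributes nothing (limit of x * ln x is 0).
                continue
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            g += observed * math.log(observed / expected)
    return 2.0 * g


# Hypothetical counts: precursor and successor co-occur more often
# than independence would predict, so G is well above zero.
dependent = g_statistic([[30, 10], [10, 30]])

# Hypothetical counts that are exactly proportional, so G is zero.
independent = g_statistic([[10, 20], [20, 40]])
```

A larger G indicates a stronger departure from independence between the two sides of a rule, which is why it can serve as a score for ranking candidate rules.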
Cite
Text
Oates et al. "Efficient Mining of Statistical Dependencies." International Joint Conference on Artificial Intelligence, 1999.
Markdown
[Oates et al. "Efficient Mining of Statistical Dependencies." International Joint Conference on Artificial Intelligence, 1999.](https://mlanthology.org/ijcai/1999/oates1999ijcai-efficient/)
BibTeX
@inproceedings{oates1999ijcai-efficient,
title = {{Efficient Mining of Statistical Dependencies}},
author = {Oates, Tim and Schmill, Matthew D. and Cohen, Paul R.},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {1999},
pages = {794-799},
url = {https://mlanthology.org/ijcai/1999/oates1999ijcai-efficient/}
}