Copula PC Algorithm for Causal Discovery from Mixed Data
Abstract
We propose the ‘Copula PC’ algorithm for causal discovery from a combination of continuous and discrete data, assumed to be drawn from a Gaussian copula model. It is based on a two-step approach. The first step applies Gibbs sampling on rank-based data to obtain samples of correlation matrices. These are then translated into an average correlation matrix and an effective number of data points, which in the second step are input to the standard PC algorithm for causal discovery. A stable version naturally arises when rerunning the PC algorithm on different Gibbs samples. Our ‘Copula PC’ algorithm extends the ‘Rank PC’ algorithm, which has been designed for Gaussian copula models for purely continuous data. In simulations, ‘Copula PC’ indeed outperforms ‘Rank PC’ in cases with mixed variables, in particular for larger numbers of data points, at the expense of a slight increase in computation time.
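The second step described above can be illustrated with a minimal sketch of the kind of conditional independence test that the standard PC algorithm typically runs on a correlation matrix: a partial correlation test with a Fisher z-transform. This is not the authors' implementation. The function names `partial_corr` and `ci_test_pvalue` are hypothetical, the correlation "samples" are a hand-made stand-in rather than output of the Gibbs sampler, and the effective number of data points `n_eff` is simply passed in as an assumed value, because the abstract does not say how it is computed.

```python
import math
import numpy as np


def partial_corr(corr, i, j, cond=()):
    """Partial correlation of variables i and j given the variables in cond."""
    idx = [i, j] + list(cond)
    sub = corr[np.ix_(idx, idx)]
    prec = np.linalg.inv(sub)  # precision matrix of the selected variables
    return -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])


def ci_test_pvalue(corr, n_eff, i, j, cond=()):
    """Two-sided p-value of the Fisher z test for zero partial correlation."""
    dof = n_eff - len(cond) - 3
    if dof <= 0:
        raise ValueError("n_eff is too small for this conditioning set")
    r = partial_corr(corr, i, j, cond)
    r = min(max(r, -0.999999), 0.999999)  # keep the log finite
    z = 0.5 * math.log((1.0 + r) / (1.0 - r))
    stat = math.sqrt(dof) * abs(z)
    return math.erfc(stat / math.sqrt(2.0))  # equals 2 * (1 - Phi(stat))


# Stand-in for correlation matrix samples (assumption: NOT real Gibbs output).
# The base matrix encodes a chain X0 -> X1 -> X2, so corr(X0, X2) = 0.6 * 0.6.
base = np.array([[1.00, 0.60, 0.36],
                 [0.60, 1.00, 0.60],
                 [0.36, 0.60, 1.00]])
noise = 0.02 * (np.ones((3, 3)) - np.eye(3))
samples = np.stack([base + noise, base - noise])

avg_corr = samples.mean(axis=0)  # average correlation matrix
n_eff = 200                      # assumed effective number of data points

p_marginal = ci_test_pvalue(avg_corr, n_eff, 0, 2)
p_conditional = ci_test_pvalue(avg_corr, n_eff, 0, 2, cond=(1,))
print(p_marginal, p_conditional)
```

In this toy chain the marginal test rejects independence of X0 and X2, while the test conditioned on X1 does not, which is the pattern PC would use to remove the edge between X0 and X2.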
Cite
Text
Cui et al. "Copula PC Algorithm for Causal Discovery from Mixed Data." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016. doi:10.1007/978-3-319-46227-1_24
Markdown
[Cui et al. "Copula PC Algorithm for Causal Discovery from Mixed Data." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016.](https://mlanthology.org/ecmlpkdd/2016/cui2016ecmlpkdd-copula/) doi:10.1007/978-3-319-46227-1_24
BibTeX
@inproceedings{cui2016ecmlpkdd-copula,
title = {{Copula PC Algorithm for Causal Discovery from Mixed Data}},
author = {Cui, Ruifei and Groot, Perry and Heskes, Tom},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2016},
pages = {377-392},
doi = {10.1007/978-3-319-46227-1_24},
url = {https://mlanthology.org/ecmlpkdd/2016/cui2016ecmlpkdd-copula/}
}