Copula PC Algorithm for Causal Discovery from Mixed Data

Abstract

We propose the ‘Copula PC’ algorithm for causal discovery from a combination of continuous and discrete data, assumed to be drawn from a Gaussian copula model. It is based on a two-step approach. The first step applies Gibbs sampling on rank-based data to obtain samples of correlation matrices. These samples are then summarized as an average correlation matrix and an effective number of data points, which in the second step are input to the standard PC algorithm for causal discovery. A stable version naturally arises when rerunning the PC algorithm on different Gibbs samples. Our ‘Copula PC’ algorithm extends the ‘Rank PC’ algorithm, which was designed for Gaussian copula models with purely continuous data. In simulations, ‘Copula PC’ outperforms ‘Rank PC’ in cases with mixed variables, in particular for larger numbers of data points, at the expense of a slight increase in computation time.
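
The hand-off between the two steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gibbs sampler is omitted, and the function names (`summarize_samples`, `partial_corr`, `fisher_z_pvalue`) are hypothetical. The effective-sample-size heuristic, which uses the approximation Var[ρ] ≈ (1 − ρ²)² / n and averages over variable pairs, is an assumption made for illustration, as is the use of a Fisher z-test on partial correlations as the conditional independence test inside PC.

```python
import math
import numpy as np


def summarize_samples(corr_samples):
    """Collapse sampled correlation matrices into a mean matrix and an
    effective sample size (assumed heuristic, see the lead-in)."""
    samples = np.asarray(corr_samples, dtype=float)
    mean_corr = samples.mean(axis=0)
    var_corr = samples.var(axis=0, ddof=1)
    off_diag = ~np.eye(mean_corr.shape[0], dtype=bool)
    # Assumption: Var[rho] ~ (1 - rho^2)^2 / n, solved for n per pair,
    # then averaged over all off-diagonal entries.
    n_per_pair = (1.0 - mean_corr[off_diag] ** 2) ** 2 / var_corr[off_diag]
    return mean_corr, float(n_per_pair.mean())


def partial_corr(corr, i, j, cond=()):
    """Partial correlation of variables i and j given the set cond,
    read off the inverse of the relevant correlation submatrix."""
    idx = [i, j] + list(cond)
    prec = np.linalg.inv(corr[np.ix_(idx, idx)])
    return -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])


def fisher_z_pvalue(corr, n_eff, i, j, cond=()):
    """Two-sided p-value of the Fisher z-test for zero partial correlation,
    with n_eff used in place of the number of observations."""
    dof = n_eff - len(cond) - 3
    if dof <= 0:
        raise ValueError("effective sample size too small for this test")
    r = min(max(partial_corr(corr, i, j, cond), -0.999999), 0.999999)
    z = 0.5 * math.log((1.0 + r) / (1.0 - r))
    stat = math.sqrt(dof) * abs(z)
    return math.erfc(stat / math.sqrt(2.0))  # equals 2 * (1 - Phi(stat))


# Toy stand-in for Gibbs output: two "samples" around a chain X -> Y -> Z.
base = np.array([[1.0, 0.6, 0.36],
                 [0.6, 1.0, 0.6],
                 [0.36, 0.6, 1.0]])
delta = 0.05 * (np.ones((3, 3)) - np.eye(3))
mean_corr, n_eff = summarize_samples([base + delta, base - delta])

p_marginal = fisher_z_pvalue(mean_corr, n_eff, 0, 2)        # X vs Z
p_given_y = fisher_z_pvalue(mean_corr, n_eff, 0, 2, (1,))   # X vs Z given Y
```

In this toy chain, X and Z are marginally correlated but independent given Y, so a PC-style search would keep the X–Z edge at the empty conditioning set and remove it once Y is conditioned on. Rerunning such tests on individual sampled matrices instead of the average is one way to read the "stable version" mentioned in the abstract.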

Cite

Text

Cui et al. "Copula PC Algorithm for Causal Discovery from Mixed Data." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016. doi:10.1007/978-3-319-46227-1_24

Markdown

[Cui et al. "Copula PC Algorithm for Causal Discovery from Mixed Data." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016.](https://mlanthology.org/ecmlpkdd/2016/cui2016ecmlpkdd-copula/) doi:10.1007/978-3-319-46227-1_24

BibTeX

@inproceedings{cui2016ecmlpkdd-copula,
  title     = {{Copula PC Algorithm for Causal Discovery from Mixed Data}},
  author    = {Cui, Ruifei and Groot, Perry and Heskes, Tom},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2016},
  pages     = {377--392},
  doi       = {10.1007/978-3-319-46227-1_24},
  url       = {https://mlanthology.org/ecmlpkdd/2016/cui2016ecmlpkdd-copula/}
}