A KNN-Based Non-Parametric Conditional Independence Test for Mixed Data and Application in Causal Discovery

Abstract

Testing for Conditional Independence (CI) is a fundamental task for causal discovery but is particularly challenging in mixed discrete-continuous data. In this context, inadequate assumptions or discretization of continuous variables reduce the CI test’s statistical power, which yields incorrect learned causal structures. In this work, we present a non-parametric CI test leveraging k-nearest neighbor (kNN) methods that are adaptive to mixed discrete-continuous data. In particular, a kNN-based conditional mutual information estimator serves as the test statistic, and the p-value is calculated using a kNN-based local permutation scheme. We prove the CI test’s statistical validity and power in mixed discrete-continuous data, which yields consistency when used in constraint-based causal discovery. An extensive evaluation on synthetic and real-world data shows that the proposed CI test outperforms state-of-the-art approaches in the accuracy of CI testing and causal discovery, particularly in settings with low sample sizes.
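The two ingredients named in the abstract, a kNN-based conditional mutual information (CMI) statistic and a kNN-based local permutation p-value, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It assumes scalar X, Y, and Z, numerically coded discrete values, a max-norm distance, a simplified neighbor-count CMI estimator, and the common (1 + exceedances) / (1 + permutations) p-value. All function names and parameters (`knn_cmi`, `local_permutation`, `local_perm_ci_test`, `k`, `k_perm`, `n_perm`) are hypothetical.

```python
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer n: -gamma + sum_{m=1}^{n-1} 1/m."""
    return -EULER_GAMMA + sum(1.0 / m for m in range(1, n))


def knn_cmi(x, y, z, k=5):
    """Simplified kNN estimate of I(X;Y|Z) in nats for scalar samples."""
    n = len(x)
    total = 0.0
    for i in range(n):
        # Per-coordinate distances from point i to every other point.
        d = [(abs(x[i] - x[j]), abs(y[i] - y[j]), abs(z[i] - z[j]))
             for j in range(n) if j != i]
        joint = sorted(max(t) for t in d)
        rho = joint[k - 1]  # max-norm distance to the k-th neighbor
        # With ties at distance zero (discrete data), count all tied points.
        k_i = k if rho > 0 else sum(1 for v in joint if v == 0)
        n_xz = sum(1 for a, b, c in d if max(a, c) <= rho)
        n_yz = sum(1 for a, b, c in d if max(b, c) <= rho)
        n_z = sum(1 for a, b, c in d if c <= rho)
        total += (digamma_int(k_i) - digamma_int(n_xz)
                  - digamma_int(n_yz) + digamma_int(n_z))
    return max(0.0, total / n)


def local_permutation(z, k_perm, rng):
    """Return indices perm where perm[i] is one of i's k_perm nearest points in Z."""
    n = len(z)
    used = set()
    perm = [0] * n
    order = list(range(n))
    rng.shuffle(order)
    for i in order:
        cand = list(range(n))
        rng.shuffle(cand)  # random tie-breaking; the sort below is stable
        cand.sort(key=lambda j: abs(z[i] - z[j]))
        neigh = cand[:k_perm]
        rng.shuffle(neigh)
        # Prefer a neighbor not drawn yet; otherwise reuse one.
        j = next((c for c in neigh if c not in used), neigh[-1])
        perm[i] = j
        used.add(j)
    return perm


def local_perm_ci_test(x, y, z, k=5, k_perm=5, n_perm=49, seed=0):
    """Return (CMI estimate, p-value) for the null hypothesis X independent of Y given Z."""
    rng = random.Random(seed)
    observed = knn_cmi(x, y, z, k)
    exceed = 0
    for _ in range(n_perm):
        perm = local_permutation(z, k_perm, rng)
        x_perm = [x[j] for j in perm]  # X shuffled only among Z-neighbors
        if knn_cmi(x_perm, y, z, k) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_perm)


# Illustrative mixed data: binary Z, continuous X and Y that depend on Z only.
data_rng = random.Random(1)
z = [float(data_rng.randint(0, 1)) for _ in range(80)]
x = [zi + data_rng.gauss(0, 1) for zi in z]
y = [zi + data_rng.gauss(0, 1) for zi in z]
stat, p_value = local_perm_ci_test(x, y, z)
print(f"CMI estimate = {stat:.3f}, p-value = {p_value:.3f}")
```

Shuffling X only among nearest neighbors in Z is meant to preserve the dependence between X and Z while breaking any remaining dependence between X and Y, which is what the null hypothesis requires. The quadratic-time neighbor search here is for readability only; a practical version would use a tree-based neighbor index.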

Cite

Text

Huegle et al. "A KNN-Based Non-Parametric Conditional Independence Test for Mixed Data and Application in Causal Discovery." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023. doi:10.1007/978-3-031-43412-9_32

Markdown

[Huegle et al. "A KNN-Based Non-Parametric Conditional Independence Test for Mixed Data and Application in Causal Discovery." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023.](https://mlanthology.org/ecmlpkdd/2023/huegle2023ecmlpkdd-knnbased/) doi:10.1007/978-3-031-43412-9_32

BibTeX

@inproceedings{huegle2023ecmlpkdd-knnbased,
  title     = {{A KNN-Based Non-Parametric Conditional Independence Test for Mixed Data and Application in Causal Discovery}},
  author    = {Huegle, Johannes and Hagedorn, Christopher and Schlosser, Rainer},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2023},
  pages     = {541--558},
  doi       = {10.1007/978-3-031-43412-9_32},
  url       = {https://mlanthology.org/ecmlpkdd/2023/huegle2023ecmlpkdd-knnbased/}
}