A Kernel Statistical Test of Independence
Abstract
Although kernel measures of independence have been widely applied in machine learning (notably in kernel ICA), there is as yet no method to determine whether they have detected statistically significant dependence. We provide a novel test of the independence hypothesis for one particular kernel independence measure, the Hilbert-Schmidt independence criterion (HSIC). The resulting test costs O(m^2), where m is the sample size. We demonstrate that this test outperforms established contingency table and functional correlation-based tests, and that this advantage is greater for multivariate data. Finally, we show the HSIC test also applies to text (and to structured data more generally), for which no other independence test presently exists.
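As a rough illustration of what an HSIC-based independence test can look like, the sketch below computes the standard biased empirical HSIC statistic, (1/m^2) trace(KHLH), in O(m^2) time and calibrates it with a permutation test. This is not the procedure from the paper: the abstract does not say how the null distribution is obtained, so permutation is swapped in here as a simple stand-in. The function names, the Gaussian kernel, the bandwidth `sigma`, and the number of permutations `n_perm` are all assumptions made for this example.

```python
import numpy as np


def gaussian_gram(x, sigma=1.0):
    """Gram matrix of a Gaussian kernel (assumed kernel choice, bandwidth sigma)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D input as m samples of dimension 1
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)  # squared distances
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def center(K):
    """Return H K H with H = I - (1/m) 1 1^T, using only O(m^2) operations."""
    return K - K.mean(axis=0) - K.mean(axis=1)[:, None] + K.mean()


def hsic_permutation_test(x, y, n_perm=200, sigma=1.0, seed=0):
    """Biased HSIC statistic and a permutation p-value (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    K = gaussian_gram(x, sigma)
    L = gaussian_gram(y, sigma)
    Kc = center(K)
    m = K.shape[0]

    # trace(K H L H) / m^2, written as an elementwise sum because L is symmetric.
    stat = np.sum(Kc * L) / m ** 2

    # Shuffling the y samples breaks any dependence on x, which simulates
    # the statistic under the independence hypothesis.
    null = np.empty(n_perm)
    for i in range(n_perm):
        p = rng.permutation(m)
        null[i] = np.sum(Kc * L[np.ix_(p, p)]) / m ** 2

    p_value = (1 + np.sum(null >= stat)) / (1 + n_perm)
    return stat, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = rng.normal(size=200)
    y = x ** 2 + 0.1 * rng.normal(size=200)  # dependent on x, but uncorrelated
    print(hsic_permutation_test(x, y))
```

Each permutation costs O(m^2), so the whole test costs O(n_perm * m^2). Because only Gram matrices enter the statistic, the same code would apply to text or other structured data once `gaussian_gram` is replaced by a kernel suited to that domain.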
Cite
Text
Gretton et al. "A Kernel Statistical Test of Independence." Neural Information Processing Systems, 2007.
Markdown
[Gretton et al. "A Kernel Statistical Test of Independence." Neural Information Processing Systems, 2007.](https://mlanthology.org/neurips/2007/gretton2007neurips-kernel/)
BibTeX
@inproceedings{gretton2007neurips-kernel,
title = {{A Kernel Statistical Test of Independence}},
author = {Gretton, Arthur and Fukumizu, Kenji and Teo, Choon H. and Song, Le and Schölkopf, Bernhard and Smola, Alex J.},
booktitle = {Neural Information Processing Systems},
year = {2007},
pages = {585-592},
url = {https://mlanthology.org/neurips/2007/gretton2007neurips-kernel/}
}