Correcting Bias in Statistical Tests for Network Classifier Evaluation

Abstract

Conventional significance tests are difficult to apply directly when comparing the performance of network classification models, because network data instances are not independent and identically distributed. Recent work [6] has shown that paired t-tests applied to overlapping network samples result in unacceptably high levels (e.g., up to 50%) of Type I error (i.e., the tests incorrectly conclude that models differ when they do not). Thus, we need new strategies to accurately evaluate network classifiers. In this paper, we theoretically analyze the sources of bias (e.g., dependencies among network data instances) and propose analytical corrections to standard significance tests that reduce the Type I error rate to more acceptable levels while maintaining reasonable statistical power to detect true performance differences. We validate the effectiveness of the proposed corrections empirically on both synthetic and real networks.
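The effect of overlapping samples on a paired t-test can be illustrated with a small simulation. The sketch below is a toy setup, not the paper's experiments or its proposed corrections: it assumes two hypothetical classifiers with identical accuracy (so the null hypothesis is true), treats instances as i.i.d. with no network structure, and uses illustrative parameter values (`pool`, `sample_size`, `k`, `p_correct`) that do not come from the paper. It compares how often the test rejects at the 5% level when the test samples are redrawn from one fixed pool of instances versus when each sample consists of fresh instances.

```python
import random
import statistics

# Two-sided 5% critical value of Student's t with 9 degrees of freedom (k = 10 samples).
T_CRIT_DF9 = 2.262


def paired_t_rejects(diffs, t_crit=T_CRIT_DF9):
    """Paired t-test on per-sample accuracy differences (null: mean difference is 0)."""
    mean = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0.0:
        return False  # degenerate case: no variation across samples, do not reject
    t_stat = mean / (sd / len(diffs) ** 0.5)
    return abs(t_stat) > t_crit


def draw_diff(rng, p_correct):
    """Per-instance difference in correctness (model A minus model B), in {-1, 0, 1}.

    Both hypothetical models are correct with the same probability, so the
    expected difference is exactly zero.
    """
    return (rng.random() < p_correct) - (rng.random() < p_correct)


def type_i_error(overlap, trials=400, pool=200, sample_size=100, k=10,
                 p_correct=0.7, seed=0):
    """Fraction of trials in which the test wrongly declares the models different."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        if overlap:
            # One fixed pool of instances; all k test samples are drawn from it
            # and therefore share many instances.
            pool_diffs = [draw_diff(rng, p_correct) for _ in range(pool)]
        sample_means = []
        for _ in range(k):
            if overlap:
                chosen = rng.sample(pool_diffs, sample_size)
            else:
                # Fresh instances for every sample: the k samples are independent.
                chosen = [draw_diff(rng, p_correct) for _ in range(sample_size)]
            sample_means.append(sum(chosen) / sample_size)
        rejections += paired_t_rejects(sample_means)
    return rejections / trials


if __name__ == "__main__":
    print("Type I error, overlapping samples:", type_i_error(overlap=True))
    print("Type I error, independent samples:", type_i_error(overlap=False))
```

In this toy setting, the overlapping case rejects far more often than the nominal 5%, because the sample means cluster around the chance difference in the shared pool rather than around zero, and the test underestimates their variance. Real network data adds dependencies among instances on top of this, which the simulation does not model.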

Cite

Text

Wang et al. "Correcting Bias in Statistical Tests for Network Classifier Evaluation." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2011. doi:10.1007/978-3-642-23808-6_33

Markdown

[Wang et al. "Correcting Bias in Statistical Tests for Network Classifier Evaluation." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2011.](https://mlanthology.org/ecmlpkdd/2011/wang2011ecmlpkdd-correcting/) doi:10.1007/978-3-642-23808-6_33

BibTeX

@inproceedings{wang2011ecmlpkdd-correcting,
  title     = {{Correcting Bias in Statistical Tests for Network Classifier Evaluation}},
  author    = {Wang, Tao and Neville, Jennifer and Gallagher, Brian and Eliassi-Rad, Tina},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2011},
  pages     = {506--521},
  doi       = {10.1007/978-3-642-23808-6_33},
  url       = {https://mlanthology.org/ecmlpkdd/2011/wang2011ecmlpkdd-correcting/}
}