Co-Validation: Using Model Disagreement on Unlabeled Data to Validate Classification Algorithms

Omid Madani, David M. Pennock, Gary W. Flake

NeurIPS 2004 pp. 873-880


Abstract

In the context of binary classification, we define disagreement as a measure of how often two independently-trained models differ in their classification of unlabeled data. We explore the use of disagreement for error estimation and model selection. We call the procedure co-validation, since the two models effectively (in)validate one another by comparing results on unlabeled data, which we assume is relatively cheap and plentiful compared to labeled data. We show that per-instance disagreement is an unbiased estimate of the variance of error for that instance. We also show that disagreement provides a lower bound on the prediction (generalization) error, and a tight upper bound on the "variance of prediction error", or the variance of the average error across instances, where variance is measured across training sets. We present experimental results on several data sets exploring co-validation for error estimation and model selection. The procedure is especially effective in active learning settings, where training sets are not drawn at random and cross validation overestimates error.
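As an illustration of the disagreement measure described in the abstract, the following is a minimal sketch and not the authors' implementation. It trains two models on independently drawn labeled sets and reports how often they differ on a pool of unlabeled points. The nearest-centroid learner, the synthetic Gaussian data, and all function names (`disagreement_rate`, `fit_centroids`, `predict_centroids`, `sample`) are assumptions made only for this example.

```python
import numpy as np

def disagreement_rate(pred_a, pred_b):
    """Fraction of instances on which two label vectors differ."""
    pred_a = np.asarray(pred_a)
    pred_b = np.asarray(pred_b)
    if pred_a.shape != pred_b.shape:
        raise ValueError("prediction arrays must have the same shape")
    return float(np.mean(pred_a != pred_b))

def fit_centroids(X, y):
    """Toy binary learner (an assumption, not from the paper): one mean per class."""
    return X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)

def predict_centroids(model, X):
    """Assign each point to the class whose centroid is closer."""
    c0, c1 = model
    d0 = np.linalg.norm(X - c0, axis=1)
    d1 = np.linalg.norm(X - c1, axis=1)
    return (d1 < d0).astype(int)

def sample(rng, n):
    """Hypothetical data source: two overlapping 2-D Gaussian classes."""
    y = np.arange(n) % 2  # balanced labels so both classes are always present
    X = rng.normal(loc=y[:, None] * 1.5, scale=1.0, size=(n, 2))
    return X, y

rng = np.random.default_rng(0)

# Two small labeled training sets, drawn independently of each other.
X_a, y_a = sample(rng, 20)
X_b, y_b = sample(rng, 20)

# A larger pool whose labels are never used.
X_unlabeled, _ = sample(rng, 5000)

model_a = fit_centroids(X_a, y_a)
model_b = fit_centroids(X_b, y_b)

d = disagreement_rate(
    predict_centroids(model_a, X_unlabeled),
    predict_centroids(model_b, X_unlabeled),
)
print(f"disagreement on unlabeled pool: {d:.3f}")
```

Only the unlabeled pool is needed to compute the statistic. How it relates to prediction error and its variance is the subject of the paper's analysis and is not reproduced here.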


Cite

Text

Madani et al. "Co-Validation: Using Model Disagreement on Unlabeled Data to Validate Classification Algorithms." Neural Information Processing Systems, 2004.

Markdown

[Madani et al. "Co-Validation: Using Model Disagreement on Unlabeled Data to Validate Classification Algorithms." Neural Information Processing Systems, 2004.](https://mlanthology.org/neurips/2004/madani2004neurips-covalidation/)

BibTeX

@inproceedings{madani2004neurips-covalidation,
  title     = {{Co-Validation: Using Model Disagreement on Unlabeled Data to Validate Classification Algorithms}},
  author    = {Madani, Omid and Pennock, David M. and Flake, Gary W.},
  booktitle = {Neural Information Processing Systems},
  year      = {2004},
  pages     = {873-880},
  url       = {https://mlanthology.org/neurips/2004/madani2004neurips-covalidation/}
}