Proper Model Selection with Significance Test
Abstract
Model selection is an important and ubiquitous task in machine learning. To select models with the best future classification performance, as measured by a goal metric, an evaluation metric is often used to choose the best classification model among the competing ones. A common practice is to use the same goal and evaluation metric. However, several recent studies claim that using an evaluation metric (such as AUC) other than the goal metric (such as accuracy) results in better selection of the correct models. In this paper, we point out a flaw in the experimental design of those studies, and propose an improved method to test the claim. Our extensive experiments show convincingly that only the goal metric itself can most reliably select the correct classification models.
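The distinction between a goal metric and an evaluation metric can be illustrated with a minimal sketch. It is not the paper's experimental design or its significance test. It only shows, on toy scores invented for illustration, that accuracy and AUC can pick different models from the same validation data. The function names (`accuracy`, `auc`, `select`), the 0.5 decision threshold, and the two score lists are all assumptions.

```python
from typing import Callable, Dict, Sequence


def accuracy(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of examples classified correctly at a fixed threshold (assumed 0.5)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, y_true)) / len(y_true)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select(models: Dict[str, Sequence[float]], y_true: Sequence[int],
           metric: Callable[[Sequence[int], Sequence[float]], float]) -> str:
    """Return the name of the model with the highest value of the evaluation metric."""
    return max(models, key=lambda name: metric(y_true, models[name]))


# Toy validation labels and predicted scores (illustrative only, not from the paper).
y_val = [1, 1, 1, 0, 0, 0]
candidates = {
    "A": [0.90, 0.80, 0.40, 0.60, 0.30, 0.20],  # ranks well, but two errors at 0.5
    "B": [0.55, 0.52, 0.51, 0.49, 0.53, 0.45],  # ranks worse, but one error at 0.5
}

chosen_by_accuracy = select(candidates, y_val, accuracy)
chosen_by_auc = select(candidates, y_val, auc)
print(chosen_by_accuracy, chosen_by_auc)
```

On this toy data the two evaluation metrics disagree, which is the situation in which the choice of evaluation metric matters. Deciding which choice better predicts future performance under the goal metric is the question the paper addresses experimentally.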
Cite
Text
Huang et al. "Proper Model Selection with Significance Test." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2008. doi:10.1007/978-3-540-87479-9_53
Markdown
[Huang et al. "Proper Model Selection with Significance Test." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2008.](https://mlanthology.org/ecmlpkdd/2008/huang2008ecmlpkdd-proper/) doi:10.1007/978-3-540-87479-9_53
BibTeX
@inproceedings{huang2008ecmlpkdd-proper,
title = {{Proper Model Selection with Significance Test}},
author = {Huang, Jin and Ling, Charles X. and Zhang, Harry and Matwin, Stan},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2008},
pages = {536--547},
doi = {10.1007/978-3-540-87479-9_53},
url = {https://mlanthology.org/ecmlpkdd/2008/huang2008ecmlpkdd-proper/}
}