A Process-Oriented Heuristic for Model Selection
Abstract
Current methods to avoid overfitting are either data-oriented (using separate data for validation) or representation-oriented (penalizing complexity in the model). This paper proposes process-oriented evaluation, where a model's expected generalization error is computed as a function of the search process that led to it. The paper develops the necessary theoretical framework, and applies it to one type of learning: rule induction. A process-oriented version of the CN2 rule learner is empirically compared with the default CN2. The process-oriented version is more accurate in a large majority of the datasets, with high significance, and also produces simpler models. Experiments in artificial domains suggest that process-oriented evaluation is particularly useful in high-dimensional domains.
1 INTRODUCTION
Overfitting avoidance is often considered the central problem of machine learning (e.g., Cheeseman & Oldford, 1994). If a learner is sufficiently powerful, it must guard against selec...
Cite
Text
Domingos. "A Process-Oriented Heuristic for Model Selection." International Conference on Machine Learning, 1998.
Markdown
[Domingos. "A Process-Oriented Heuristic for Model Selection." International Conference on Machine Learning, 1998.](https://mlanthology.org/icml/1998/domingos1998icml-process/)
BibTeX
@inproceedings{domingos1998icml-process,
title = {{A Process-Oriented Heuristic for Model Selection}},
author = {Domingos, Pedro M.},
booktitle = {International Conference on Machine Learning},
year = {1998},
pages = {127-135},
url = {https://mlanthology.org/icml/1998/domingos1998icml-process/}
}