A Process-Oriented Heuristic for Model Selection

Abstract

Current methods to avoid overfitting are either data-oriented (using separate data for validation) or representation-oriented (penalizing complexity in the model). This paper proposes process-oriented evaluation, where a model's expected generalization error is computed as a function of the search process that led to it. The paper develops the necessary theoretical framework, and applies it to one type of learning: rule induction. A process-oriented version of the CN2 rule learner is empirically compared with the default CN2. The process-oriented version is more accurate in a large majority of the datasets, with high significance, and also produces simpler models. Experiments in artificial domains suggest that process-oriented evaluation is particularly useful in high-dimensional domains.

1 INTRODUCTION

Overfitting avoidance is often considered the central problem of machine learning (e.g., Cheeseman & Oldford, 1994). If a learner is sufficiently powerful, it must guard against selec...

Cite

Text

Domingos. "A Process-Oriented Heuristic for Model Selection." International Conference on Machine Learning, 1998.

Markdown

[Domingos. "A Process-Oriented Heuristic for Model Selection." International Conference on Machine Learning, 1998.](https://mlanthology.org/icml/1998/domingos1998icml-process/)

BibTeX

@inproceedings{domingos1998icml-process,
  title     = {{A Process-Oriented Heuristic for Model Selection}},
  author    = {Domingos, Pedro M.},
  booktitle = {International Conference on Machine Learning},
  year      = {1998},
  pages     = {127--135},
  url       = {https://mlanthology.org/icml/1998/domingos1998icml-process/}
}