Combined Optimization of Feature Selection and Algorithm Parameters in Machine Learning of Language

Abstract

Comparative machine learning experiments have become an important methodology in empirical approaches to natural language processing, serving (i) to investigate which machine learning algorithms have the ‘right bias’ to solve specific natural language processing tasks, and (ii) to investigate which sources of information add to accuracy in a learning approach. Using automatic word sense disambiguation as an example task, we show that with the methodology currently used in comparative machine learning experiments, the results may often be unreliable because of the roles of, and interaction between, feature selection and algorithm parameter optimization. We propose genetic algorithms as a practical approach to achieve both higher accuracy within a single approach and more reliable comparisons.
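The idea of searching feature subsets and algorithm parameters together can be illustrated with a small sketch. The code below is not the authors' implementation: the synthetic data, the k-NN classifier, the names `make_data`, `knn_accuracy`, and `evolve`, and all GA settings (population size, mutation rates, choices of k) are assumptions chosen purely for illustration. Each individual encodes a feature mask and a value of k in one chromosome, so the genetic algorithm optimizes both jointly rather than one after the other.

```python
import random

def make_data(rng, n=120, n_features=6):
    """Synthetic data (hypothetical): only features 0 and 1 carry signal."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = [rng.gauss(label * 2.0, 1.0) if j < 2 else rng.gauss(0.0, 3.0)
             for j in range(n_features)]
        data.append((x, label))
    return data

def knn_accuracy(train, test, mask, k):
    """Accuracy of a plain k-NN classifier using only the masked-in features."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty feature set cannot classify anything
    correct = 0
    for x, y in test:
        dists = sorted((sum((x[j] - t[j]) ** 2 for j in idx), ty)
                       for t, ty in train)
        votes = sum(ty for _, ty in dists[:k])
        pred = 1 if 2 * votes > k else 0  # majority vote; k is odd
        correct += pred == y
    return correct / len(test)

def evolve(train, valid, n_features, rng,
           pop_size=12, generations=8, k_choices=(1, 3, 5, 7)):
    """Evolve (feature mask, k) pairs; fitness is validation accuracy."""
    def random_individual():
        mask = [rng.randint(0, 1) for _ in range(n_features)]
        return (mask, rng.choice(k_choices))

    def fitness(ind):
        return knn_accuracy(train, valid, ind[0], ind[1])

    def crossover(a, b):
        cut = rng.randrange(1, n_features)  # one-point crossover on the mask
        return (a[0][:cut] + b[0][cut:], rng.choice((a[1], b[1])))

    def mutate(ind):
        mask = [bit ^ (rng.random() < 0.1) for bit in ind[0]]  # bit flips
        k = rng.choice(k_choices) if rng.random() < 0.2 else ind[1]
        return (mask, k)

    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the better half unchanged (elitism), refill with offspring.
        parents = sorted(pop, key=fitness, reverse=True)[:pop_size // 2]
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    best = max(pop, key=fitness)
    return best, fitness(best)

if __name__ == "__main__":
    rng = random.Random(0)
    data = make_data(rng)
    (mask, k), acc = evolve(data[:80], data[80:], 6, rng)
    print("mask:", mask, "k:", k, "validation accuracy:", acc)
```

Because the mask and k share one chromosome, a feature subset is always evaluated with a specific parameter setting, which is the interaction the abstract warns about. In a real comparison the fitness would typically be estimated by cross-validation on training data, with a separate held-out test set for the final score.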

Cite

Text

Daelemans et al. "Combined Optimization of Feature Selection and Algorithm Parameters in Machine Learning of Language." European Conference on Machine Learning, 2003. doi:10.1007/978-3-540-39857-8_10

Markdown

[Daelemans et al. "Combined Optimization of Feature Selection and Algorithm Parameters in Machine Learning of Language." European Conference on Machine Learning, 2003.](https://mlanthology.org/ecmlpkdd/2003/daelemans2003ecml-combined/) doi:10.1007/978-3-540-39857-8_10

BibTeX

@inproceedings{daelemans2003ecml-combined,
  title     = {{Combined Optimization of Feature Selection and Algorithm Parameters in Machine Learning of Language}},
  author    = {Daelemans, Walter and Hoste, Véronique and De Meulder, Fien and Naudts, Bart},
  booktitle = {European Conference on Machine Learning},
  year      = {2003},
  pages     = {84--95},
  doi       = {10.1007/978-3-540-39857-8_10},
  url       = {https://mlanthology.org/ecmlpkdd/2003/daelemans2003ecml-combined/}
}