Combined Optimization of Feature Selection and Algorithm Parameters in Machine Learning of Language
Abstract
Comparative machine learning experiments have become an important methodology in empirical approaches to natural language processing (i) to investigate which machine learning algorithms have the ‘right bias’ to solve specific natural language processing tasks, and (ii) to investigate which sources of information add to accuracy in a learning approach. Using automatic word sense disambiguation as an example task, we show that with the methodology currently used in comparative machine learning experiments, the results may often not be reliable because of the role of and interaction between feature selection and algorithm parameter optimization. We propose genetic algorithms as a practical approach to achieve both higher accuracy within a single approach, and more reliable comparisons.
Cite
Text
Daelemans et al. "Combined Optimization of Feature Selection and Algorithm Parameters in Machine Learning of Language." European Conference on Machine Learning, 2003. doi:10.1007/978-3-540-39857-8_10
Markdown
[Daelemans et al. "Combined Optimization of Feature Selection and Algorithm Parameters in Machine Learning of Language." European Conference on Machine Learning, 2003.](https://mlanthology.org/ecmlpkdd/2003/daelemans2003ecml-combined/) doi:10.1007/978-3-540-39857-8_10
BibTeX
@inproceedings{daelemans2003ecml-combined,
title = {{Combined Optimization of Feature Selection and Algorithm Parameters in Machine Learning of Language}},
author = {Daelemans, Walter and Hoste, Véronique and De Meulder, Fien and Naudts, Bart},
booktitle = {European Conference on Machine Learning},
year = {2003},
pages = {84-95},
doi = {10.1007/978-3-540-39857-8_10},
url = {https://mlanthology.org/ecmlpkdd/2003/daelemans2003ecml-combined/}
}