Model Combination in the Multiple-Data-Batches Scenario
Abstract
The approach of combining models learned from multiple batches of data provides an alternative to the common practice of learning one model from all the available data (i.e., the data combination approach). This paper empirically examines the baseline behaviour of the model combination approach in this multiple-data-batches scenario. We find that model combination can lead to better performance even if the disjoint batches of data are drawn randomly from a larger sample, and we relate the relative performance of the two approaches to the learning curve of the classifier used. The practical implication of our results is that one should consider using model combination rather than data combination, especially when multiple batches of data for the same task are readily available. Another interesting result is empirical: the near-asymptotic performance of a single model, in some classification tasks, can be significantly improved by combining multiple models (derived from the same algorithm) if the constituent models are substantially different and there is some regularity in the models to be exploited by the combination method used. Comparisons with known theoretical results are also provided.
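The contrast between the two approaches can be illustrated with a minimal sketch. It is not the authors' experimental setup: the nearest-centroid learner, the unweighted majority-vote combiner, the synthetic two-class data, and all function names (`train_centroid`, `split_batches`, `combine_predict`, `make_data`) are assumptions chosen only to keep the example self-contained. The abstract does not specify which classifier or combination method the paper uses.

```python
import random
from collections import Counter

def train_centroid(data):
    """Fit a nearest-centroid classifier: one mean vector per class label."""
    sums, counts = {}, Counter()
    for x, y in data:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict(model, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))

def split_batches(data, k, rng):
    """Randomly partition data into k disjoint batches."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def combine_predict(models, x):
    """Model combination by unweighted majority vote (an assumed combiner)."""
    votes = Counter(predict(m, x) for m in models)
    return votes.most_common(1)[0][0]

def accuracy(predict_fn, data):
    """Fraction of examples in data that predict_fn labels correctly."""
    return sum(predict_fn(x) == y for x, y in data) / len(data)

def make_data(n, rng):
    """Two Gaussian blobs in 2-D with labels 0 and 1 (synthetic, illustrative)."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        centre = 2.0 * y
        data.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), y))
    return data

rng = random.Random(0)
train, test = make_data(300, rng), make_data(200, rng)

# Data combination: one model learned from all the available data.
single_model = train_centroid(train)

# Model combination: one model per disjoint batch, combined at prediction time.
batches = split_batches(train, 3, rng)
batch_models = [train_centroid(b) for b in batches]

acc_single = accuracy(lambda x: predict(single_model, x), test)
acc_combined = accuracy(lambda x: combine_predict(batch_models, x), test)
print(f"data combination:  {acc_single:.3f}")
print(f"model combination: {acc_combined:.3f}")
```

With a learner this stable the two accuracies will typically be close; the abstract's point is that the gap depends on the learning curve of the classifier and on how different the constituent models are, which this toy setup does not attempt to reproduce.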
Cite
Text
Ting and Low. "Model Combination in the Multiple-Data-Batches Scenario." European Conference on Machine Learning, 1997. doi:10.1007/3-540-62858-4_90
Markdown
[Ting and Low. "Model Combination in the Multiple-Data-Batches Scenario." European Conference on Machine Learning, 1997.](https://mlanthology.org/ecmlpkdd/1997/ting1997ecml-model/) doi:10.1007/3-540-62858-4_90
BibTeX
@inproceedings{ting1997ecml-model,
title = {{Model Combination in the Multiple-Data-Batches Scenario}},
author = {Ting, Kai Ming and Low, Boon Toh},
booktitle = {European Conference on Machine Learning},
year = {1997},
pages = {250-265},
doi = {10.1007/3-540-62858-4_90},
url = {https://mlanthology.org/ecmlpkdd/1997/ting1997ecml-model/}
}