Scaling up: Distributed Machine Learning with Cooperation

Abstract

Machine-learning methods are becoming increasingly popular for automated data analysis. However, standard methods do not scale up to massive scientific and business data sets without expensive hardware. This paper investigates a practical alternative for scaling up: the use of distributed processing to take advantage of the often dormant PCs and workstations available on local networks. Each workstation runs a common rule-learning program on a subset of the data. We first show that for commonly used rule-evaluation criteria, a simple form of cooperation can guarantee that a rule will look good to the set of cooperating learners if and only if it would look good to a single learner operating with the entire data set. We then show how such a system can further capitalize on different perspectives by sharing learned knowledge for significant reduction in search effort. We demonstrate the power of the method by learning from a massive data set taken from the domain of cel...

Cite

Text

Provost and Hennessy. "Scaling up: Distributed Machine Learning with Cooperation." AAAI Conference on Artificial Intelligence, 1996.

Markdown

[Provost and Hennessy. "Scaling up: Distributed Machine Learning with Cooperation." AAAI Conference on Artificial Intelligence, 1996.](https://mlanthology.org/aaai/1996/provost1996aaai-scaling/)

BibTeX

@inproceedings{provost1996aaai-scaling,
  title     = {{Scaling up: Distributed Machine Learning with Cooperation}},
  author    = {Provost, Foster J. and Hennessy, Daniel N.},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {1996},
  pages     = {74--79},
  url       = {https://mlanthology.org/aaai/1996/provost1996aaai-scaling/}
}