Multiple Pass Streaming Algorithms for Learning Mixtures of Distributions in $\mathbb{R}^d$

Chang, Kevin L.

doi:10.1007/978-3-540-75225-7_19

ALT 2007 pp. 211-226

Abstract

We present a multiple pass streaming algorithm for learning the density function of a mixture of $k$ uniform distributions over rectangles (cells) in $\mathbb{R}^d$, for any $d > 0$. Our learning model is as follows: samples drawn according to the mixture are placed in arbitrary order in a data stream that may only be accessed sequentially by an algorithm with a very limited random access memory space. Our algorithm makes $2\ell + 1$ passes, for any $\ell > 0$, and requires memory at most $\tilde O(\epsilon^{-2/\ell}k^2d^4+(2k)^d)$. This exhibits a strong tradeoff between passes and memory: a few more passes significantly lower the memory requirements, thus trading one of the two most important resources in streaming computation for the other. Chang and Kannan [1] first considered this problem for $d = 1, 2$. Our learning algorithm is especially appropriate for situations where massive data sets of samples are available, but practical computation with such large inputs requires very restricted models of computation.
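To get a feel for the tradeoff stated in the abstract, the following minimal sketch evaluates the two terms inside the memory bound for a few values of $\ell$. It is only an illustration, not part of the paper: the function name `memory_bound_terms` and the sample parameter values are assumptions, and the constants and logarithmic factors hidden by the $\tilde O$ notation are dropped, so the numbers indicate growth rates rather than actual memory usage.

```python
def memory_bound_terms(eps, k, d, ell):
    """Return (passes, bound) for the expression inside the O-tilde.

    Hypothetical helper: it evaluates eps^(-2/ell) * k^2 * d^4 + (2k)^d
    and ignores all hidden constants and logarithmic factors.
    """
    if ell < 1:
        raise ValueError("ell must be a positive integer")
    passes = 2 * ell + 1                                # 2*ell + 1 passes
    first_term = eps ** (-2.0 / ell) * k ** 2 * d ** 4  # shrinks as ell grows
    second_term = (2 * k) ** d                          # independent of ell
    return passes, first_term + second_term


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    for ell in (1, 2, 4):
        passes, bound = memory_bound_terms(eps=0.01, k=3, d=2, ell=ell)
        print(f"ell={ell}: passes={passes}, bound (up to constants/logs)={bound:.1f}")
```

Only the first term depends on $\ell$, so additional passes reduce the $\epsilon$-dependent part of the bound while the $(2k)^d$ part stays fixed.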

Cite

Text

Chang. "Multiple Pass Streaming Algorithms for Learning Mixtures of Distributions in $\mathbb{R}^d$." International Conference on Algorithmic Learning Theory, 2007. doi:10.1007/978-3-540-75225-7_19

Markdown

[Chang. "Multiple Pass Streaming Algorithms for Learning Mixtures of Distributions in $\mathbb{R}^d$." International Conference on Algorithmic Learning Theory, 2007.](https://mlanthology.org/alt/2007/chang2007alt-multiple/) doi:10.1007/978-3-540-75225-7_19

BibTeX

@inproceedings{chang2007alt-multiple,
  title     = {{Multiple Pass Streaming Algorithms for Learning Mixtures of Distributions in $\mathbb{R}^d$}},
  author    = {Chang, Kevin L.},
  booktitle = {International Conference on Algorithmic Learning Theory},
  year      = {2007},
  pages     = {211-226},
  doi       = {10.1007/978-3-540-75225-7_19},
  url       = {https://mlanthology.org/alt/2007/chang2007alt-multiple/}
}