Sequential Information Bottleneck for Finite Data

Abstract

The sequential information bottleneck (sIB) algorithm clusters co-occurrence data, such as text documents versus words. We introduce a variant that models sparse co-occurrence data by a generative process. This turns the objective function of sIB, mutual information, into a Bayes factor, while keeping it asymptotically intact for non-sparse data. Experimental performance of the new algorithm is comparable to the original sIB on large data sets, and better on smaller, sparse sets.
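To make the clustering setup concrete, here is a minimal sketch of sequential-IB-style hard clustering: documents (rows of a document-word count matrix) are greedily reassigned, one at a time, to the cluster that maximizes the mutual information I(cluster; word). This is an illustrative simplification, not the paper's method: it recomputes the mutual information naively for each candidate cluster instead of using an incremental merge cost, and all function and variable names (`sib`, `mutual_information`, `doc_word`) are our own.

```python
import numpy as np

def mutual_information(counts):
    """I(T;W) of a joint count table: rows = clusters, cols = words."""
    p = counts / counts.sum()
    pt = p.sum(axis=1, keepdims=True)   # cluster marginal p(t)
    pw = p.sum(axis=0, keepdims=True)   # word marginal p(w)
    nz = p > 0                          # skip zero cells (0 log 0 = 0)
    return float((p[nz] * np.log(p[nz] / (pt @ pw)[nz])).sum())

def sib(doc_word, n_clusters, n_sweeps=10, seed=0):
    """Greedy sequential reassignment: draw one document out of its
    cluster, try every cluster, keep the assignment with highest I(T;W)."""
    rng = np.random.default_rng(seed)
    n_docs = doc_word.shape[0]
    labels = rng.integers(n_clusters, size=n_docs)
    table = np.zeros((n_clusters, doc_word.shape[1]))
    for d in range(n_docs):
        table[labels[d]] += doc_word[d]
    for _ in range(n_sweeps):
        changed = False
        for d in rng.permutation(n_docs):
            table[labels[d]] -= doc_word[d]      # remove document d
            best, best_mi = labels[d], -np.inf
            for t in range(n_clusters):          # try each cluster
                table[t] += doc_word[d]
                mi = mutual_information(table)
                if mi > best_mi:
                    best, best_mi = t, mi
                table[t] -= doc_word[d]
            table[best] += doc_word[d]           # commit best choice
            changed |= bool(best != labels[d])
            labels[d] = best
        if not changed:                          # converged
            break
    return labels
```

On a toy corpus with two disjoint vocabularies, the greedy sweeps separate the two document groups into distinct clusters.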

Cite

Text

Peltonen et al. "Sequential Information Bottleneck for Finite Data." International Conference on Machine Learning, 2004. doi:10.1145/1015330.1015375

Markdown

[Peltonen et al. "Sequential Information Bottleneck for Finite Data." International Conference on Machine Learning, 2004.](https://mlanthology.org/icml/2004/peltonen2004icml-sequential/) doi:10.1145/1015330.1015375

BibTeX

@inproceedings{peltonen2004icml-sequential,
  title     = {{Sequential Information Bottleneck for Finite Data}},
  author    = {Peltonen, Jaakko and Sinkkonen, Janne and Kaski, Samuel},
  booktitle = {International Conference on Machine Learning},
  year      = {2004},
  doi       = {10.1145/1015330.1015375},
  url       = {https://mlanthology.org/icml/2004/peltonen2004icml-sequential/}
}