Sequential Information Bottleneck for Finite Data
Abstract
The sequential information bottleneck (sIB) algorithm clusters co-occurrence data such as text documents versus words. We introduce a variant that models sparse co-occurrence data with a generative process. This turns the objective function of sIB, mutual information, into a Bayes factor, while keeping it asymptotically intact for non-sparse data. Experimentally, the new algorithm performs comparably to the original sIB on large data sets, and better on smaller, sparse sets.
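To make the clustering setting concrete, the following is a minimal sketch of sIB-style hard clustering of a documents-by-words count matrix: each document is sequentially drawn out of its cluster and reassigned to the cluster that maximizes the mutual information I(T; Y) between cluster labels and words. This illustrates only the basic sIB objective, not the paper's finite-data (Bayes factor) variant; the function names and the greedy sweep schedule are illustrative assumptions.

```python
import numpy as np

def mutual_information(counts):
    """I(T;Y) for a clusters-by-words count matrix (natural log)."""
    p = counts / counts.sum()
    pt = p.sum(axis=1, keepdims=True)  # cluster marginal
    py = p.sum(axis=0, keepdims=True)  # word marginal
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(p > 0, p / (pt * py), 1.0)
    return float((p * np.log(ratio)).sum())

def sib_cluster(X, k, n_sweeps=10, seed=0):
    """Greedy sequential reassignment maximizing I(T;Y).

    X: documents-by-words co-occurrence count matrix (illustrative sketch).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    labels = rng.integers(k, size=n)
    counts = np.zeros((k, X.shape[1]))
    for i in range(n):
        counts[labels[i]] += X[i]
    for _ in range(n_sweeps):
        moved = 0
        for i in range(n):
            counts[labels[i]] -= X[i]  # draw document i out of its cluster
            scores = []
            for t in range(k):         # score each candidate reassignment
                counts[t] += X[i]
                scores.append(mutual_information(counts))
                counts[t] -= X[i]
            best = int(np.argmax(scores))
            if best != labels[i]:
                moved += 1
            labels[i] = best
            counts[best] += X[i]
        if moved == 0:                 # converged: a full sweep moved nothing
            break
    return labels
```

On a toy matrix with two obvious word groups, e.g. `X = np.array([[5,5,0,0],[4,6,0,0],[0,0,5,5],[0,0,6,4]], float)`, `sib_cluster(X, 2)` separates the first two documents from the last two.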
Cite
Text
Peltonen et al. "Sequential Information Bottleneck for Finite Data." International Conference on Machine Learning, 2004. doi:10.1145/1015330.1015375
Markdown
[Peltonen et al. "Sequential Information Bottleneck for Finite Data." International Conference on Machine Learning, 2004.](https://mlanthology.org/icml/2004/peltonen2004icml-sequential/) doi:10.1145/1015330.1015375
BibTeX
@inproceedings{peltonen2004icml-sequential,
  title = {{Sequential Information Bottleneck for Finite Data}},
  author = {Peltonen, Jaakko and Sinkkonen, Janne and Kaski, Samuel},
  booktitle = {International Conference on Machine Learning},
  year = {2004},
  doi = {10.1145/1015330.1015375},
  url = {https://mlanthology.org/icml/2004/peltonen2004icml-sequential/}
}