The Information Bottleneck Method
Abstract
We define the relevant information in a signal $x \in X$ as being the information that this signal provides about another signal $y \in Y$. Examples include the information that face images provide about the names of the people portrayed, or the information that speech sounds provide about the words spoken. Understanding the signal $x$ requires more than just predicting $y$; it also requires specifying which features of $X$ play a role in the prediction. We formalize this problem as that of finding a short code for $X$ that preserves the maximum information about $Y$.
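The trade-off described in the abstract can be made concrete on a toy example. The sketch below assumes the usual Lagrangian form of the objective, $I(X;T) - \beta\, I(T;Y)$, where $T$ is the short code for $X$ and $\beta$ weights the information kept about $Y$; the abstract itself does not spell this form out. The function names `mutual_information` and `ib_objective`, the symbol $T$, and the toy distribution are illustrative assumptions, not taken from the paper.

```python
import math


def mutual_information(joint):
    """Mutual information in bits of a joint distribution joint[a][b]."""
    p_a = [sum(row) for row in joint]          # marginal over rows
    p_b = [sum(col) for col in zip(*joint)]    # marginal over columns
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:                          # 0 * log(0) is taken as 0
                mi += p * math.log2(p / (p_a[a] * p_b[b]))
    return mi


def ib_objective(p_xy, encoder, beta):
    """Return (I(X;T) - beta * I(T;Y), I(X;T), I(T;Y)).

    encoder[x][t] is p(t|x); T depends on Y only through X (assumption).
    """
    n_x, n_y, n_t = len(p_xy), len(p_xy[0]), len(encoder[0])
    p_x = [sum(row) for row in p_xy]
    # Joint p(x, t) = p(x) * p(t|x)
    p_xt = [[p_x[x] * encoder[x][t] for t in range(n_t)] for x in range(n_x)]
    # Joint p(t, y) = sum_x p(x, y) * p(t|x)
    p_ty = [[sum(p_xy[x][y] * encoder[x][t] for x in range(n_x))
             for y in range(n_y)] for t in range(n_t)]
    i_xt = mutual_information(p_xt)
    i_ty = mutual_information(p_ty)
    return i_xt - beta * i_ty, i_xt, i_ty


# Toy joint distribution (hypothetical): four equally likely values of x,
# where x0 and x1 always give y0, and x2 and x3 always give y1.
p_xy = [[0.25, 0.0], [0.25, 0.0], [0.0, 0.25], [0.0, 0.25]]

# Two candidate codes: keep every x distinct, or merge x values that
# predict the same y.
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
merged = [[1, 0], [1, 0], [0, 1], [0, 1]]

_, id_cost, id_relevant = ib_objective(p_xy, identity, beta=1.0)
_, mg_cost, mg_relevant = ib_objective(p_xy, merged, beta=1.0)
```

In this toy case both codes retain the same information about $Y$, while the merged code uses fewer bits to describe $X$, which is the sense in which it is a shorter code that preserves the relevant information.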
Cite
Text
Tishby et al. "The Information Bottleneck Method." Proceedings of the 37th Annual Allerton Conference on Communication, Control, and Computing, 1999.
Markdown
[Tishby et al. "The Information Bottleneck Method." Proceedings of the 37th Annual Allerton Conference on Communication, Control, and Computing, 1999.](https://mlanthology.org/misc/1999/tishby1999misc-information/)
BibTeX
@misc{tishby1999misc-information,
title = {{The Information Bottleneck Method}},
author = {Tishby, Naftali and Pereira, Fernando C. and Bialek, William},
howpublished = {Proceedings of the 37th Annual Allerton Conference on Communication, Control, and Computing},
year = {1999},
pages = {368--377},
url = {https://mlanthology.org/misc/1999/tishby1999misc-information/}
}