On the Maximum Mutual Information Capacity of Neural Architectures
Abstract
We derive a closed-form expression for the maximum mutual information, the largest value of $I(X;Z)$ obtainable via training, for a broad family of neural network architectures. This quantity is essential to several branches of machine learning theory and practice. Quantitatively, we show that the maximum mutual information expressions for these families all stem from generalizations of a single catch-all formula. Qualitatively, we show that the maximum mutual information of an architecture is most strongly influenced by the width of the narrowest layer of the network (the ``information bottleneck'' in a different sense of the phrase) and by any statistical invariances captured by the architecture.
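The bottleneck intuition in the abstract rests on a generic information-theoretic fact: when the representation $Z$ is a deterministic function of the input $X$, $I(X;Z) = H(Z)$, so a layer that can only realize a small number of distinct states caps the mutual information regardless of the rest of the network. The sketch below is illustrative only (it is not the paper's closed-form result): the 4-state map `x % 4` stands in for a narrow bottleneck, and the entropy/MI estimators are standard plug-in estimates from samples.

```python
from collections import Counter
from math import log

def entropy(samples):
    """Plug-in Shannon entropy (in nats) of a discrete sample."""
    n = len(samples)
    return -sum((c / n) * log(c / n) for c in Counter(samples).values())

def mutual_information(xs, zs):
    """Plug-in estimate of I(X;Z) = H(X) + H(Z) - H(X,Z)."""
    return entropy(xs) + entropy(zs) - entropy(list(zip(xs, zs)))

# Uniform discrete "input" and a hypothetical 4-state "bottleneck" map.
xs = list(range(256)) * 40
zs = [x % 4 for x in xs]

mi = mutual_information(xs, zs)
# Z is a deterministic function of X, so I(X;Z) = H(Z) = log 4 exactly,
# no matter how much entropy X itself carries.
print(round(mi, 6), round(entropy(zs), 6), round(log(4), 6))
```

Widening the bottleneck (e.g. `x % 8`) raises the cap to $\log 8$; collapsing it to one state drives the estimate to zero, matching the qualitative claim that the narrowest layer governs the maximum achievable $I(X;Z)$.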
Cite
Text
Foggo and Yu. "On the Maximum Mutual Information Capacity of Neural Architectures." ICML 2023 Workshops: NCW, 2023.
Markdown
[Foggo and Yu. "On the Maximum Mutual Information Capacity of Neural Architectures." ICML 2023 Workshops: NCW, 2023.](https://mlanthology.org/icmlw/2023/foggo2023icmlw-maximum/)
BibTeX
@inproceedings{foggo2023icmlw-maximum,
  title = {{On the Maximum Mutual Information Capacity of Neural Architectures}},
  author = {Foggo, Brandon and Yu, Nanpeng},
  booktitle = {ICML 2023 Workshops: NCW},
  year = {2023},
  url = {https://mlanthology.org/icmlw/2023/foggo2023icmlw-maximum/}
}