FlowNIB: An Information Bottleneck Analysis of Bidirectional vs. Unidirectional Language Models

Abstract

Bidirectional language models (LMs) consistently show stronger context understanding than unidirectional models, yet the theoretical reason remains unclear. We present a simple information bottleneck (IB) perspective: bidirectional representations preserve more mutual information (MI) about both the input and the target, yielding richer features for downstream tasks. We adopt a layer-wise view and hypothesize that, at comparable capacity, bidirectional layers retain more useful signal than unidirectional ones. To test this claim empirically, we present Flow Neural Information Bottleneck (FlowNIB), a lightweight, post-hoc framework that estimates comparable MI values for individual LM layers, quantifying how much information each layer carries about the input and the target. FlowNIB takes three inputs: (i) the original LM's inputs/dataset, (ii) ground-truth labels, and (iii) layer activations. From these it simultaneously estimates the MI for both the input-layer and layer-label pairs. Empirically, bidirectional LM layers exhibit higher MI than the layers of unidirectional LMs of similar, and even larger, size. As a result, bidirectional LMs outperform unidirectional LMs across extensive experiments on NLU benchmarks (e.g., GLUE), commonsense reasoning, and regression tasks, demonstrating superior context understanding.
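
To make the two quantities in the abstract concrete, the sketch below computes a plug-in (counting) estimate of mutual information for the input-layer pair and the layer-label pair on toy discrete data. This is not FlowNIB's estimator, which the abstract does not specify; real layer activations are continuous and high-dimensional, so a counting estimator like this one would not apply to them directly. The function name `mutual_information` and the toy arrays `inputs`, `layer`, and `labels` are assumptions made purely for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in nats from paired discrete samples.

    Illustrative only: a counting estimator for small discrete alphabets,
    not the estimator used by FlowNIB.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # empirical joint counts
    px = Counter(xs)              # empirical marginal counts of X
    py = Counter(ys)              # empirical marginal counts of Y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: four distinct inputs, a "layer" that keeps only
# one bit of the input, and a label equal to that bit.
inputs = [0, 1, 2, 3, 0, 1, 2, 3]
layer = [0, 0, 1, 1, 0, 0, 1, 1]
labels = [0, 0, 1, 1, 0, 0, 1, 1]

i_input_layer = mutual_information(inputs, layer)   # input-layer pair
i_layer_label = mutual_information(layer, labels)   # layer-label pair
print(f"I(X;Z) = {i_input_layer:.4f} nats, I(Z;Y) = {i_layer_label:.4f} nats")
```

In this toy case the layer is a deterministic one-bit function of the input and the label equals that bit, so both estimates equal log 2 nats. A layer that discarded the bit would drive the layer-label estimate to zero, which is the kind of per-layer difference the abstract describes comparing across model types.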

Cite

Text

Kowsher et al. "FlowNIB: An Information Bottleneck Analysis of Bidirectional vs. Unidirectional Language Models." International Conference on Learning Representations, 2026.

Markdown

[Kowsher et al. "FlowNIB: An Information Bottleneck Analysis of Bidirectional vs. Unidirectional Language Models." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/kowsher2026iclr-flownib/)

BibTeX

@inproceedings{kowsher2026iclr-flownib,
  title     = {{FlowNIB: An Information Bottleneck Analysis of Bidirectional vs. Unidirectional Language Models}},
  author    = {Kowsher, Md and Prottasha, Nusrat Jahan and Xu, Shiyun and Mohanto, Shetu and Yousefi, Niloofar and Garibay, Ozlem and Chen, Chen},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/kowsher2026iclr-flownib/}
}