Uncovering Uncertainty in Transformer Inference

Abstract

We explore the Iterative Inference Hypothesis (IIH) within the context of transformer-based language models, aiming to understand how a model's latent representations are progressively refined and whether observable differences are present between correct and incorrect generations. Our findings provide empirical support for the IIH, showing that the n-th token embedding in the residual stream follows a trajectory of decreasing loss. Additionally, we observe that the rate at which residual embeddings converge to a stable output representation reflects uncertainty in the token generation process. Finally, we introduce a method utilizing cross-entropy to detect this uncertainty and demonstrate its potential to distinguish between correct and incorrect token generations on a dataset of idioms.
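Below is a minimal sketch, not the authors' released code, of one way to probe the convergence behavior the abstract describes. It assumes a GPT-2 model from Hugging Face transformers, a logit-lens-style readout (applying the final layer norm and unembedding to each layer's residual-stream state), and cross-entropy between each layer's next-token distribution and the model's final output distribution as the convergence signal; all of these specifics are assumptions for illustration, not details taken from the paper.

import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def layerwise_cross_entropy(prompt: str) -> list[float]:
    """Cross-entropy between each layer's logit-lens next-token distribution
    (at the last position) and the model's final output distribution."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    # hidden_states: tuple of (num_layers + 1) tensors, each (1, seq_len, d_model)
    hidden_states = out.hidden_states
    ln_f = model.transformer.ln_f   # final layer norm
    unembed = model.lm_head         # unembedding matrix (tied with token embeddings)

    final_log_probs = F.log_softmax(out.logits[0, -1], dim=-1)
    final_probs = final_log_probs.exp()

    ces = []
    for h in hidden_states:
        # Logit-lens readout of the residual stream at this layer, last position.
        layer_logits = unembed(ln_f(h[0, -1]))
        layer_log_probs = F.log_softmax(layer_logits, dim=-1)
        # High cross-entropy: the layer's distribution still disagrees with the
        # final output; low cross-entropy: the representation has converged.
        ces.append(float(-(final_probs * layer_log_probs).sum()))
    return ces

print(layerwise_cross_entropy("Actions speak louder than"))

Under the paper's framing, one would expect this per-layer cross-entropy to fall faster across depth for confident (and typically correct) generations than for uncertain ones, e.g. on idiom completions, though the exact detection rule used by the authors is not specified in the abstract.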

Cite

Text

Brothers et al. "Uncovering Uncertainty in Transformer Inference." NeurIPS 2024 Workshops: MINT, 2024.

Markdown

[Brothers et al. "Uncovering Uncertainty in Transformer Inference." NeurIPS 2024 Workshops: MINT, 2024.](https://mlanthology.org/neuripsw/2024/brothers2024neuripsw-uncovering/)

BibTeX

@inproceedings{brothers2024neuripsw-uncovering,
  title     = {{Uncovering Uncertainty in Transformer Inference}},
  author    = {Brothers, Greyson and Mannering, Willa M. and Winder, John and Tien, Amber},
  booktitle = {NeurIPS 2024 Workshops: MINT},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/brothers2024neuripsw-uncovering/}
}