Exploring Monotonicity in Early-Exiting Language Models
Abstract
Large Language Models (LLMs) have shown impressive results across the board, but inference can be costly. A promising remedy is early-exiting methods, which assume that not all tokens need the same amount of computation and therefore exit the LLM at earlier layers. Several early-exiting methods have been proposed, relying on the implicit assumption that as the network does more computation, it becomes more confident in its prediction. We investigate this assumption for two early-exiting methods and propose three new confidence measures for early exiting based on these insights. We find early evidence that monotonicity benefits the quality of token generation.
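The core mechanism the abstract describes can be sketched as follows: at each transformer layer, a confidence score (here, the top softmax probability of the intermediate prediction) is compared against a threshold, and generation exits at the first layer that is confident enough. This is a minimal illustrative sketch, not the paper's method; the threshold value, the softmax-confidence measure, and the `early_exit_layer` helper are all assumptions for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_layer(per_layer_logits, threshold=0.9):
    """Hypothetical confidence-based early exit.

    Scans intermediate predictions layer by layer and returns
    (layer_index, predicted_token_index) for the first layer whose top
    softmax probability reaches `threshold`; falls back to the final layer.
    The monotonicity assumption the paper examines is that this top
    probability tends to increase with depth.
    """
    for i, logits in enumerate(per_layer_logits):
        probs = softmax(logits)
        conf = max(probs)
        if conf >= threshold:
            return i, probs.index(conf)
    # No layer was confident enough: use the final layer's prediction.
    final_probs = softmax(per_layer_logits[-1])
    return len(per_layer_logits) - 1, final_probs.index(max(final_probs))

# Toy logits for one token across three layers: confidence rises with depth,
# so the model can exit at layer 1 instead of running all layers.
logits_by_layer = [
    [0.1, 0.2, 0.15],  # near-uniform, low confidence
    [0.2, 5.0, 0.1],   # confident in token 1
    [0.3, 6.0, 0.1],   # even more confident (monotone increase)
]
layer, token = early_exit_layer(logits_by_layer, threshold=0.9)
```

Monotonicity matters here because if confidence dipped and rose non-monotonically across layers, a threshold rule like this could exit on a spuriously confident intermediate layer.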
Cite
Text
Laitenberger et al. "Exploring Monotonicity in Early-Exiting Language Models." ICML 2024 Workshops: ES-FoMo-II, 2024.
Markdown
[Laitenberger et al. "Exploring Monotonicity in Early-Exiting Language Models." ICML 2024 Workshops: ES-FoMo-II, 2024.](https://mlanthology.org/icmlw/2024/laitenberger2024icmlw-exploring/)
BibTeX
@inproceedings{laitenberger2024icmlw-exploring,
title = {{Exploring Monotonicity in Early-Exiting Language Models}},
author = {Laitenberger, Filipe and Belitsky, Max and Sheremet, Denys},
booktitle = {ICML 2024 Workshops: ES-FoMo-II},
year = {2024},
url = {https://mlanthology.org/icmlw/2024/laitenberger2024icmlw-exploring/}
}