Iteration Head: A Mechanistic Study of Chain-of-Thought

Abstract

Chain-of-Thought (CoT) reasoning is known to improve Large Language Models both empirically and in terms of theoretical approximation power. However, our understanding of the inner workings of CoT capabilities, and of the conditions under which they emerge, remains limited. This paper helps fill this gap by demonstrating how CoT reasoning arises in transformers in a controlled and interpretable setting. In particular, we observe the appearance of a specialized attention mechanism dedicated to iterative reasoning, which we coin "iteration heads". We track both the emergence and the precise workings of these iteration heads down to the attention level, and measure how the CoT skills they give rise to transfer between tasks.
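To make the notion of iterative reasoning concrete, here is a minimal sketch (not from the paper's code release) of a parity-style iterative task of the kind such studies use: each CoT token is the running state of a loop, so producing step i only requires the previous state and the i-th input token. This local dependency is exactly what an attention head specialized for iteration can implement by attending back to the right input position. The function name parity_cot is a hypothetical illustration, not an identifier from the paper.

def parity_cot(bits):
    """Return the chain-of-thought (running parities) for a bit string.

    Each CoT token s_i satisfies s_i = s_{i-1} XOR x_i, so a single
    attention step per token suffices to compute the next state.
    """
    state, cot = 0, []
    for b in bits:
        state ^= b          # one loop iteration: s_i = s_{i-1} XOR x_i
        cot.append(state)
    return cot              # the final element is the answer

print(parity_cot([1, 0, 1, 1]))  # [1, 1, 0, 1] -> parity of the input is 1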

Cite

Text

Cabannes et al. "Iteration Head: A Mechanistic Study of Chain-of-Thought." Neural Information Processing Systems, 2024. doi:10.52202/079017-3463

Markdown

[Cabannes et al. "Iteration Head: A Mechanistic Study of Chain-of-Thought." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/cabannes2024neurips-iteration/) doi:10.52202/079017-3463

BibTeX

@inproceedings{cabannes2024neurips-iteration,
  title     = {{Iteration Head: A Mechanistic Study of Chain-of-Thought}},
  author    = {Cabannes, Vivien and Arnal, Charles and Bouaziz, Wassim and Yang, Alice and Charton, Francois and Kempe, Julia},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-3463},
  url       = {https://mlanthology.org/neurips/2024/cabannes2024neurips-iteration/}
}