Reproducibility Study of "Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals"

Wiegman, Tijs; Perotti, Leyla; Pravdová, Viktória; Brand, Ori; Heuss, Maria

TMLR 2025

Abstract

This paper presents a reproducibility study of Ortu et al. (2024), which investigates the competition between the factual recall and counterfactual in-context adaptation mechanisms in GPT-2. We extend the original authors' experiments with softmax-normalized logits as an additional metric for tracking how token scores evolve through the model. Our reproduced and extended experiments validate the original paper's main claims about where the competition of mechanisms is located in GPT-2, namely that the competition emerges predominantly in later layers and is driven by the attention blocks corresponding to a subset of specialized attention heads. Additionally, we explore intervention strategies based on attention modification to increase factual accuracy. We find that simultaneously boosting multiple attention heads involved in factual recall can have a synergistic effect on factual accuracy, which is further enhanced by suppressing copy heads. Finally, we rework how the competition of mechanisms is conceptualized and find that the specialized factual recall heads identified by Ortu et al. (2024) act as copy regulators, penalizing counterfactual in-context adaptation and rewarding the copying of factual information.

Cite

Text

Wiegman et al. "Reproducibility Study of 'Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals'." Transactions on Machine Learning Research, 2025.

Markdown

[Wiegman et al. "Reproducibility Study of 'Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals'." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/wiegman2025tmlr-reproducibility/)

BibTeX

@article{wiegman2025tmlr-reproducibility,
  title     = {{Reproducibility Study of "Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals"}},
  author    = {Wiegman, Tijs and Perotti, Leyla and Pravdová, Viktória and Brand, Ori and Heuss, Maria},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/wiegman2025tmlr-reproducibility/}
}