How Transformers Reason: A Case Study on a Synthetic Propositional Logic Problem

Abstract

Large language models (LLMs) have demonstrated remarkable performance in tasks that require reasoning abilities. Motivated by recent works showing evidence of LLMs being able to plan and reason on abstract reasoning problems in context, we conduct a set of controlled experiments on a synthetic propositional logic problem to provide a mechanistic understanding of how such abilities arise. In particular, for a three-layer decoder-only Transformer trained solely on our synthetic dataset, we identify the specific mechanisms by which it solves the reasoning task. Specifically, we identify certain "planning" and "reasoning" circuits which require cooperation between the attention blocks to implement the desired reasoning algorithm in its entirety. To expand our findings, we then study a larger model, Mistral 7B. Using activation patching, we characterize internal components that are critical in solving our logic problem. Overall, our work systematically uncovers novel aspects of small and large transformers, and continues the study of how they plan and reason.
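
The abstract mentions activation patching as the tool used to locate critical components in Mistral 7B. As a rough illustration of the general technique (not the paper's actual setup, models, or prompts), the following minimal PyTorch sketch caches an intermediate activation from a "clean" run and patches it into a "corrupted" run; the degree to which the clean behavior is recovered indicates how causally important that component is. The toy model, inputs, and layer choice here are all hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a network with an intermediate component to probe.
# Shapes and module names are illustrative only.
model = nn.Sequential(
    nn.Embedding(10, 8),   # vocabulary of 10 toy tokens
    nn.Linear(8, 8),       # the "component" whose causal role we test
    nn.ReLU(),
    nn.Linear(8, 10),      # logits over the toy vocabulary
)

clean_input = torch.tensor([1, 2, 3])    # "clean" prompt
corrupt_input = torch.tensor([1, 7, 3])  # minimally corrupted prompt

layer = model[1]
cached = {}

# 1) Cache the component's output on the clean run.
def save_hook(module, inputs, output):
    cached["act"] = output.detach()

handle = layer.register_forward_hook(save_hook)
clean_logits = model(clean_input)
handle.remove()

# 2) Re-run on the corrupted input, overwriting that component's
#    output with the cached clean activation ("patching").
def patch_hook(module, inputs, output):
    return cached["act"]

handle = layer.register_forward_hook(patch_hook)
patched_logits = model(corrupt_input)
handle.remove()

corrupt_logits = model(corrupt_input)

# 3) If patching restores most of the clean output, the patched
#    component is causally important for the prediction.
print("clean vs corrupt :", (clean_logits - corrupt_logits).norm().item())
print("clean vs patched :", (clean_logits - patched_logits).norm().item())
```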

Cite

Text

Hong et al. "How Transformers Reason: A Case Study on a Synthetic Propositional Logic Problem." NeurIPS 2024 Workshops: MATH-AI, 2024.

Markdown

[Hong et al. "How Transformers Reason: A Case Study on a Synthetic Propositional Logic Problem." NeurIPS 2024 Workshops: MATH-AI, 2024.](https://mlanthology.org/neuripsw/2024/hong2024neuripsw-transformers/)

BibTeX

@inproceedings{hong2024neuripsw-transformers,
  title     = {{How Transformers Reason: A Case Study on a Synthetic Propositional Logic Problem}},
  author    = {Hong, Guan Zhe and Dikkala, Nishanth and Luo, Enming and Rashtchian, Cyrus and Wang, Xin and Panigrahy, Rina},
  booktitle = {NeurIPS 2024 Workshops: MATH-AI},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/hong2024neuripsw-transformers/}
}