Self-Consistency Improves Chain of Thought Reasoning in Language Models

Abstract

Chain-of-thought prompting combined with pretrained large language models has achieved encouraging results on complex reasoning tasks. In this paper, we propose a new decoding strategy, self-consistency, to replace the naive greedy decoding used in chain-of-thought prompting. It first samples a diverse set of reasoning paths instead of only taking the greedy one, and then selects the most consistent answer by marginalizing out all possible reasoning paths. Self-consistency leverages the intuition that a complex reasoning problem typically admits multiple different ways of thinking leading to its unique correct answer. Our extensive empirical evaluation shows that self-consistency boosts the performance of chain-of-thought prompting with a striking margin on a range of popular arithmetic and commonsense reasoning benchmarks, including GSM8K (+17.9%), SVAMP (+11.0%), AQuA (+12.2%), StrategyQA (+6.4%) and ARC-challenge (+3.9%).
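As a rough illustration of the decoding strategy the abstract describes, here is a minimal Python sketch: sample several chain-of-thought completions (rather than one greedy decode), keep only each path's final answer, and return the answer that appears most often. The helpers sample_fn, extract_answer, and the num_samples default are hypothetical placeholders, not part of the paper; the unweighted majority vote corresponds to the simplest aggregation over sampled paths.

from collections import Counter

def self_consistency(prompt, sample_fn, extract_answer, num_samples=40):
    """Self-consistency decoding sketch.

    sample_fn(prompt) -> str           # one sampled CoT completion (assumed interface)
    extract_answer(completion) -> str  # parses the final answer from a completion (assumed)
    """
    answers = []
    for _ in range(num_samples):
        completion = sample_fn(prompt)       # temperature sampling, not greedy decoding
        answer = extract_answer(completion)  # discard the reasoning, keep the final answer
        if answer is not None:
            answers.append(answer)
    if not answers:
        return None
    # "Marginalize out" the reasoning paths: aggregate completions by their
    # final answer and pick the most consistent (most frequent) one.
    best_answer, _count = Counter(answers).most_common(1)[0]
    return best_answer

The vote over final answers, rather than over full reasoning strings, is what lets many distinct reasoning paths count as agreement when they reach the same conclusion.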

Cite

Text

Wang et al. "Self-Consistency Improves Chain of Thought Reasoning in Language Models." International Conference on Learning Representations, 2023.

Markdown

[Wang et al. "Self-Consistency Improves Chain of Thought Reasoning in Language Models." International Conference on Learning Representations, 2023.](https://mlanthology.org/iclr/2023/wang2023iclr-selfconsistency/)

BibTeX

@inproceedings{wang2023iclr-selfconsistency,
  title     = {{Self-Consistency Improves Chain of Thought Reasoning in Language Models}},
  author    = {Wang, Xuezhi and Wei, Jason and Schuurmans, Dale and Le, Quoc V. and Chi, Ed H. and Narang, Sharan and Chowdhery, Aakanksha and Zhou, Denny},
  booktitle = {International Conference on Learning Representations},
  year      = {2023},
  url       = {https://mlanthology.org/iclr/2023/wang2023iclr-selfconsistency/}
}