Distilling System 2 into System 1
Abstract
Large language models (LLMs) can spend extra compute during inference to generate intermediate thoughts, which helps to produce better final responses. Since Chain-of-Thought \citep{CoT}, many such {\em System 2} techniques have been proposed such as Rephrase and Respond \citep{RaR}, System 2 Attention \citep{S2A} and Branch-Solve-Merge \citep{BSM}. In this work we investigate self-supervised methods to ``compile'' (distill) higher quality outputs from System 2 techniques back into LLM generations {\em without} intermediate reasoning token sequences, as this reasoning has been distilled into {\em System 1}. We show that several such techniques can be successfully distilled, resulting in improved results compared to the original System 1 performance, and with less inference cost than System 2. We posit that System 2 distillation will be an important feature of future continually learning AI systems, enabling them to focus System 2 capabilities on the reasoning tasks that they cannot yet do well.
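As a rough illustration of the idea described in the abstract, the sketch below shows one way such a self-supervised distillation pipeline could look: sample several System 2 (here, Chain-of-Thought) outputs per input, keep only inputs whose final answers agree under majority vote (an unsupervised quality filter), and use the resulting (input, answer) pairs, with no intermediate reasoning tokens, as fine-tuning targets for the System 1 model. The `llm_generate` callable, the answer-parsing convention, and the agreement threshold are all hypothetical; the paper's exact prompts and consistency criteria may differ.

```python
from collections import Counter
from typing import Callable, List, Tuple


def system2_answer(llm_generate: Callable[[str], str], question: str) -> str:
    """Run a System 2 method (here: Chain-of-Thought) and keep only the final answer."""
    prompt = f"{question}\nLet's think step by step, then state 'Final answer:' followed by the answer."
    full_output = llm_generate(prompt)
    # Assume the answer follows a marker; real parsing is task-specific.
    return full_output.split("Final answer:")[-1].strip()


def build_distillation_set(
    llm_generate: Callable[[str], str],
    questions: List[str],
    num_samples: int = 8,
    min_agreement: float = 0.75,
) -> List[Tuple[str, str]]:
    """Self-supervised curation: keep (question, answer) pairs whose sampled
    System 2 answers agree often enough (an unsupervised quality proxy)."""
    dataset = []
    for q in questions:
        answers = [system2_answer(llm_generate, q) for _ in range(num_samples)]
        answer, count = Counter(answers).most_common(1)[0]
        if count / num_samples >= min_agreement:
            # The target contains no intermediate reasoning: the System 1
            # model is later fine-tuned to map q -> answer directly.
            dataset.append((q, answer))
    return dataset


if __name__ == "__main__":
    # Toy stub so the sketch runs end to end; a real setup would sample from an LLM.
    stub = lambda prompt: "Step 1: 6 * 7 = 42. Final answer: 42"
    print(build_distillation_set(stub, ["What is 6 * 7?"], num_samples=4))
```

The curated pairs would then be used to fine-tune the model on input-to-answer mappings, so that at inference time no System 2 reasoning tokens are generated.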
Cite
Text
Yu et al. "Distilling System 2 into System 1." NeurIPS 2024 Workshops: Sys2-Reasoning, 2024.
Markdown
[Yu et al. "Distilling System 2 into System 1." NeurIPS 2024 Workshops: Sys2-Reasoning, 2024.](https://mlanthology.org/neuripsw/2024/yu2024neuripsw-distilling/)
BibTeX
@inproceedings{yu2024neuripsw-distilling,
  title = {{Distilling System 2 into System 1}},
  author = {Yu, Ping and Xu, Jing and Weston, Jason E and Kulikov, Ilia},
  booktitle = {NeurIPS 2024 Workshops: Sys2-Reasoning},
  year = {2024},
  url = {https://mlanthology.org/neuripsw/2024/yu2024neuripsw-distilling/}
}