Efficient Reasoning Models: A Survey

Abstract

Reasoning models have demonstrated remarkable progress in solving complex and logic-intensive tasks by generating extended Chain-of-Thoughts (CoTs) prior to arriving at a final answer. Yet the emergence of this “slow-thinking” paradigm, with numerous tokens generated in sequence, inevitably introduces substantial computational overhead, highlighting an urgent need for effective acceleration. This survey provides a comprehensive overview of recent advances in efficient reasoning. It categorizes existing works into three key directions: (1) shorter – compressing lengthy CoTs into concise yet effective reasoning chains; (2) smaller – developing compact language models with strong reasoning capabilities through techniques such as knowledge distillation, model compression, and reinforcement learning; and (3) faster – designing efficient decoding strategies to accelerate the inference of reasoning models. A curated collection of the papers discussed in this survey is available in our GitHub repository: https://github.com/fscdc/Awesome-Efficient-Reasoning-Models.

Cite

Text

Feng et al. "Efficient Reasoning Models: A Survey." Transactions on Machine Learning Research, 2025.

Markdown

[Feng et al. "Efficient Reasoning Models: A Survey." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/feng2025tmlr-efficient/)

BibTeX

@article{feng2025tmlr-efficient,
  title     = {{Efficient Reasoning Models: A Survey}},
  author    = {Feng, Sicheng and Fang, Gongfan and Ma, Xinyin and Wang, Xinchao},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/feng2025tmlr-efficient/}
}