Transferring Reasoning Capabilities Between LLMs Operating via Curriculum Learning Policy

Abstract

In-context reasoning methods, exemplified by Chain-of-Thought (CoT), strengthen the reasoning abilities of large language models (LLMs), eliciting step-by-step solutions to complex reasoning tasks. However, the capacity to deliver robust CoT explanations arises only in models with billions of parameters, which is a barrier to entry for many users forced to operate at a smaller model scale, i.e., with Small Language Models (SLMs). Even though many companies are releasing LLMs of the same family with a reduced number of parameters, these models sometimes produce misleading answers and are unable to deliver accurate step-wise reasoned answers. This paper proposes a method to transfer step-wise reasoning to SLMs via Instruction-tuning (IT) on synthetic demonstrations delivered in a pedagogically motivated manner. First, we propose aligning step-wise reasoning capabilities via IT using Demonstrations "taught" by LLM teachers to SLM students. Then, we apply Curriculum Learning (CL), a pedagogically motivated learning method that improves the IT phase. We analyse the impact on downstream performance on four question-answering benchmarks. The results show that SLMs can be instructed to reason via Demonstrations delivered by LLMs. We then move a step further: conceiving SLMs as human learners, we expose them to a CL teaching-based approach, obtaining better downstream performance.
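
To make the Curriculum Learning idea concrete, the following is a minimal sketch of ordering teacher-generated demonstrations from easier to harder before instruction-tuning. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how difficulty is measured, so the `difficulty` proxy (number of reasoning steps in the teacher's rationale), the `Demonstration` type, and the `curriculum_stages` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Demonstration:
    """One synthetic training example (hypothetical structure)."""
    question: str
    rationale: str  # step-wise answer written by the teacher LLM, one step per line


def difficulty(demo: Demonstration) -> int:
    # ASSUMPTION: difficulty is approximated by the number of non-empty
    # reasoning lines. The abstract does not define the actual measure.
    return sum(1 for line in demo.rationale.splitlines() if line.strip())


def curriculum_stages(demos: list[Demonstration], n_stages: int) -> list[list[Demonstration]]:
    """Sort demonstrations easy-to-hard and split them into at most n_stages chunks."""
    if n_stages < 1:
        raise ValueError("n_stages must be at least 1")
    if not demos:
        return []
    ordered = sorted(demos, key=difficulty)  # stable sort: ties keep input order
    size = -(-len(ordered) // n_stages)      # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


demos = [
    Demonstration("Q-hard", "step 1\nstep 2\nstep 3"),
    Demonstration("Q-easy", "step 1"),
    Demonstration("Q-mid", "step 1\nstep 2"),
]
stages = curriculum_stages(demos, n_stages=2)
```

In such a setup, the student model would be instruction-tuned on the stages in order, so that shorter rationales are seen before longer ones; any other difficulty proxy could be swapped in for `difficulty` without changing the rest of the sketch.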

Cite

Text

Ranaldi et al. "Transferring Reasoning Capabilities Between LLMs Operating via Curriculum Learning Policy." Transactions on Machine Learning Research, 2025.

Markdown

[Ranaldi et al. "Transferring Reasoning Capabilities Between LLMs Operating via Curriculum Learning Policy." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/ranaldi2025tmlr-transferring/)

BibTeX

@article{ranaldi2025tmlr-transferring,
  title     = {{Transferring Reasoning Capabilities Between LLMs Operating via Curriculum Learning Policy}},
  author    = {Ranaldi, Leonardo and Pucci, Giulia and Zanzotto, Fabio Massimo},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/ranaldi2025tmlr-transferring/}
}