Transferring Reasoning Capabilities Between LLMs Operating via Curriculum Learning Policy
Abstract
In-context reasoning methods, exemplified by Chain-of-Thought (CoT), strengthen the reasoning abilities of large language models (LLMs), eliciting them to solve complex reasoning tasks step by step. Nevertheless, the capacity to deliver robust CoT explanations arises only in models with billions of parameters, which is a barrier to entry for the many users forced to operate at a smaller model scale, i.e., with Small Language Models (SLMs). Even though many companies are releasing LLMs of the same family with a reduced number of parameters, these models sometimes produce misleading answers and are unable to deliver accurate step-wise reasoned answers. This paper proposes a method to transfer step-wise reasoning to SLMs by operating via Instruction-tuning (IT) on synthetic demonstrations delivered in a pedagogically motivated manner. In particular, we first propose aligning step-wise reasoning capabilities via IT using Demonstrations "taught" by LLM teachers to SLM students. Then, we operate via Curriculum Learning (CL), a pedagogically motivated learning method that improves the IT phase. We analyse the impact on downstream performance across four question-answering benchmarks. The results show that SLMs can be instructed to reason via Demonstrations delivered by LLMs. We then move a step further: conceiving of SLMs as human learners, we expose them to a CL teaching-based approach, obtaining better downstream performance.
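As a rough illustration of the curriculum idea described above, the following minimal Python sketch orders teacher-generated demonstrations from easy to hard and splits them into stages that could be fed to an instruction-tuning loop one after another. It is a sketch under stated assumptions, not the paper's procedure: the abstract does not say how difficulty is measured, so `step_count` (number of rationale lines) is a hypothetical proxy, and the `question`/`rationale`/`answer` field names and the `curriculum_stages` helper are likewise invented for illustration.

```python
from typing import Callable, Dict, List

Demo = Dict[str, str]  # hypothetical schema: "question", "rationale", "answer"


def step_count(demo: Demo) -> int:
    """Hypothetical difficulty proxy: non-empty lines in the teacher's rationale."""
    return sum(1 for line in demo["rationale"].splitlines() if line.strip())


def curriculum_stages(
    demos: List[Demo],
    n_stages: int,
    difficulty: Callable[[Demo], int] = step_count,
) -> List[List[Demo]]:
    """Sort demonstrations easy-to-hard and split them into at most n_stages chunks."""
    if n_stages <= 0:
        raise ValueError("n_stages must be positive")
    if not demos:
        return []
    ordered = sorted(demos, key=difficulty)  # stable sort keeps ties in input order
    size = -(-len(ordered) // n_stages)      # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


# Toy demonstrations standing in for LLM-teacher outputs (assumed data).
demos = [
    {"question": "q3", "rationale": "a\nb\nc", "answer": "x"},
    {"question": "q1", "rationale": "a", "answer": "y"},
    {"question": "q2", "rationale": "a\nb", "answer": "z"},
    {"question": "q4", "rationale": "a\nb\nc\nd", "answer": "w"},
]
stages = curriculum_stages(demos, n_stages=2)
stage_questions = [[d["question"] for d in stage] for stage in stages]
```

Any other difficulty measure can be swapped in through the `difficulty` argument; the staging logic stays the same.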
Cite
Text
Ranaldi et al. "Transferring Reasoning Capabilities Between LLMs Operating via Curriculum Learning Policy." Transactions on Machine Learning Research, 2025.
Markdown
[Ranaldi et al. "Transferring Reasoning Capabilities Between LLMs Operating via Curriculum Learning Policy." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/ranaldi2025tmlr-transferring/)
BibTeX
@article{ranaldi2025tmlr-transferring,
title = {{Transferring Reasoning Capabilities Between LLMs Operating via Curriculum Learning Policy}},
author = {Ranaldi, Leonardo and Pucci, Giulia and Zanzotto, Fabio Massimo},
journal = {Transactions on Machine Learning Research},
year = {2025},
url = {https://mlanthology.org/tmlr/2025/ranaldi2025tmlr-transferring/}
}