Generalizing Reasoning Problems to Longer Lengths
Abstract
Length generalization (LG) is a challenging problem in learning to reason: a model trained on reasoning problems of smaller lengths or sizes struggles with problems of larger lengths or sizes. Although it has been proven that reasoning can be learned if the intermediate reasoning steps (also known as chain-of-thought (CoT)) are given in the training data, existing results apply only within the training length range (interpolation), whereas LG concerns extrapolation beyond that range. This paper begins by presenting a theorem that identifies the root cause of the LG problem. It then defines a class of reasoning problems for which achieving LG with Transformers can be theoretically guaranteed, provided the CoT schemes are constructed to meet a proposed condition called $(n,r)$-consistency.
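The setup the abstract describes can be sketched as follows: training examples are drawn from problems up to some length cutoff, while the LG test set contains strictly longer problems, each paired with its chain-of-thought trace. The toy task (parity of a bit string), the helper names, and the cutoff below are illustrative assumptions, not details from the paper.

```python
def split_by_length(problems, max_train_len):
    """Partition problems into an interpolation (training) set of
    length <= max_train_len and an extrapolation (LG) test set."""
    train = [p for p in problems if len(p) <= max_train_len]
    test = [p for p in problems if len(p) > max_train_len]
    return train, test

def parity_with_cot(bits):
    """Toy reasoning task: parity of a bit string. The CoT trace
    records the running parity after each bit (the intermediate
    reasoning steps the abstract refers to)."""
    steps, acc = [], 0
    for b in bits:
        acc ^= int(b)
        steps.append(acc)
    return steps, acc  # (CoT trace, final answer)

problems = ["10", "1101", "101010", "11001100"]
train_set, lg_test_set = split_by_length(problems, max_train_len=4)
```

Under this split, a model fit only on `train_set` is evaluated on `lg_test_set`; LG asks whether accuracy survives that jump in length.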
Cite
Text
Xiao and Liu. "Generalizing Reasoning Problems to Longer Lengths." International Conference on Learning Representations, 2025.
Markdown
[Xiao and Liu. "Generalizing Reasoning Problems to Longer Lengths." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/xiao2025iclr-generalizing/)
BibTeX
@inproceedings{xiao2025iclr-generalizing,
title = {{Generalizing Reasoning Problems to Longer Lengths}},
author = {Xiao, Changnan and Liu, Bing},
booktitle = {International Conference on Learning Representations},
year = {2025},
url = {https://mlanthology.org/iclr/2025/xiao2025iclr-generalizing/}
}