How Do Position Encodings Affect Length Generalization? Case Studies on In-Context Function Learning
Abstract
The capability of In-Context Learning (ICL) is crucial for large language models to generalize across a wide range of tasks. Through prompts alone, these models can accurately predict outcomes for previously unseen tasks without retraining. However, this generalization does not extend to input length: the effectiveness of ICL diminishes as inputs grow far beyond the training length, producing errors in the generated text. To investigate this issue, we study a dataset of in-context functions to understand how Transformer models operate under ICL and how they generalize with length. We generate data from regression and Boolean functions and use meta-learning to endow the model with ICL capabilities. Our experimental results indicate that position encodings can significantly mitigate length-generalization failures, with the most effective encoding extending the maximum usable input length to more than eight times the original training length. However, further analysis reveals that while a position encoding can improve length generalization, it can also compromise the model's other capabilities, such as its ability to generalize across different data types. Overall, our research shows that position encodings have a pronounced positive effect on length generalization, though one that requires a careful trade-off against data-generalization performance.
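The abstract does not give implementation details, but the setup it describes resembles the in-context function-learning framework popularized by Garg et al. (2022): each training example is a freshly sampled function, presented as a sequence of (input, output) pairs, so the model must infer the function from context alone. The following is a minimal sketch of that kind of data generation, assuming linear regression and Boolean conjunctions as the function classes; the dimensions, sampling distributions, and the 8x evaluation length are illustrative choices, not the paper's actual configuration.

```python
import numpy as np

def sample_regression_prompt(n_points, dim, rng):
    """One in-context linear-regression task: a hidden weight vector w
    defines the function; the model sees (x, y) pairs and must infer w."""
    w = rng.standard_normal(dim)             # hidden task parameter
    xs = rng.standard_normal((n_points, dim))
    ys = xs @ w                              # noiseless linear targets
    return xs, ys

def sample_conjunction_prompt(n_points, dim, rng):
    """One Boolean task: the target is a conjunction (AND) of a random
    subset of input bits, a classic Boolean function class."""
    mask = rng.random(dim) < 0.3             # bits participating in the AND
    if not mask.any():
        mask[rng.integers(dim)] = True       # ensure a non-trivial function
    xs = rng.integers(0, 2, size=(n_points, dim))
    ys = np.all(xs[:, mask] == 1, axis=1).astype(float)
    return xs, ys

def make_batch(sampler, batch_size, n_points, dim, seed=0):
    """Meta-learning batch: every element is a fresh task, so the model
    can only succeed by learning to learn from the in-context examples."""
    rng = np.random.default_rng(seed)
    tasks = [sampler(n_points, dim, rng) for _ in range(batch_size)]
    xs = np.stack([t[0] for t in tasks])     # (batch, n_points, dim)
    ys = np.stack([t[1] for t in tasks])     # (batch, n_points)
    return xs, ys

# Train on short contexts, then probe length generalization by evaluating
# with many more in-context points than the model ever saw in training.
train_xs, train_ys = make_batch(sample_regression_prompt, 64, n_points=40, dim=20)
eval_xs, eval_ys = make_batch(sample_regression_prompt, 64, n_points=320, dim=20)  # 8x longer
print(train_xs.shape, eval_xs.shape)
```

Training only on short contexts and evaluating on much longer ones is what isolates the contribution of the position encoding: any accuracy drop at 8x length reflects the encoding's extrapolation behavior rather than the function class itself.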
Cite
Text
Lin et al. "How Do Position Encodings Affect Length Generalization? Case Studies on In-Context Function Learning." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I23.34637

Markdown

[Lin et al. "How Do Position Encodings Affect Length Generalization? Case Studies on In-Context Function Learning." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/lin2025aaai-position/) doi:10.1609/AAAI.V39I23.34637

BibTeX
@inproceedings{lin2025aaai-position,
title = {{How Do Position Encodings Affect Length Generalization? Case Studies on In-Context Function Learning}},
author = {Lin, Di-Nan and Yao, Jui-Feng and Wu, Kun-Da and Xu, Hao and Huang, Chen-Hsi and Kao, Hung-Yu},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2025},
pages = {24576--24584},
doi = {10.1609/AAAI.V39I23.34637},
url = {https://mlanthology.org/aaai/2025/lin2025aaai-position/}
}