MrRoPE: Mixed-Radix Rotary Position Embedding

Abstract

RoPE extension refers to modifying or generalizing Rotary Position Embedding (RoPE) so that a model can handle sequences longer than those encountered during pre-training. However, current extension strategies are highly diverse and lack a unified theoretical foundation. In this paper, we propose MrRoPE (Mixed-radix RoPE), a generalized encoding formulation based on a radix-system conversion perspective, which unifies various RoPE-extension approaches as distinct radix conversion strategies. Based on this theory, we introduce two training-free extensions, MrRoPE-Uni and MrRoPE-Pro, which use uniform and progressive radix conversion strategies, respectively, to achieve “train short, test long” generalization. Without fine-tuning, MrRoPE-Pro sustains over 85% recall in the 128K-context Needle-in-a-Haystack test and achieves more than double YaRN’s accuracy on the Infinite-Bench retrieval and dialogue subsets. Theoretical analysis confirms that MrRoPE-Pro raises the upper bound of RoPE's attainable encoding length, which further supports the reliability and utility of our theory and methodology.
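To make the radix perspective concrete, the following is a minimal sketch and not the paper's method: the abstract does not give the MrRoPE-Uni or MrRoPE-Pro formulas, so none are reproduced here. The sketch assumes the standard RoPE frequency schedule and treats the ratio between the wavelengths of consecutive rotary pairs as the "radix" of each digit. It then compares two well-known extensions, Position Interpolation and NTK-aware base scaling, under that reading. The function names, `dim = 8`, and `scale = 4.0` are illustrative assumptions.

```python
import math

def rope_wavelengths(dim, base=10000.0):
    """Wavelength (in tokens) of each of the dim/2 rotary pairs in standard RoPE."""
    return [2 * math.pi * base ** (2 * i / dim) for i in range(dim // 2)]

def radices(wavelengths):
    """Ratio between consecutive wavelengths, read here as the radix of each digit."""
    return [slow / fast for fast, slow in zip(wavelengths, wavelengths[1:])]

def pi_wavelengths(dim, scale, base=10000.0):
    """Position Interpolation: every wavelength is stretched by the same factor."""
    return [scale * w for w in rope_wavelengths(dim, base)]

def ntk_wavelengths(dim, scale, base=10000.0):
    """NTK-aware scaling: enlarge the base so that the fastest pair is unchanged
    and the slowest pair is stretched by `scale`."""
    new_base = base * scale ** (dim / (dim - 2))
    return rope_wavelengths(dim, new_base)

# Illustrative values (assumptions): with dim = 8 and base = 10000,
# the original radix is 10000 ** (2 / 8) = 10.
dim, scale = 8, 4.0
orig = rope_wavelengths(dim)
pi_w = pi_wavelengths(dim, scale)
ntk = ntk_wavelengths(dim, scale)

print("original radices:", radices(orig))
print("PI radices:      ", radices(pi_w))
print("NTK radices:     ", radices(ntk))
```

Under these assumptions, Position Interpolation leaves every radix unchanged and only rescales the lowest digit, while NTK-aware scaling enlarges every radix by the same factor, `scale ** (2 / (dim - 2))`. A progressive strategy would presumably vary that factor across dimensions, but its exact form should be taken from the paper itself.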

Cite

Text

Tian et al. "MrRoPE: Mixed-Radix Rotary Position Embedding." International Conference on Learning Representations, 2026.

Markdown

[Tian et al. "MrRoPE: Mixed-Radix Rotary Position Embedding." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/tian2026iclr-mrrope/)

BibTeX

@inproceedings{tian2026iclr-mrrope,
  title     = {{MrRoPE: Mixed-Radix Rotary Position Embedding}},
  author    = {Tian, Qingyuan and Zhu, Wenhong and Liu, Xiaoran and Wang, Xiaofeng and Wang, Rui},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/tian2026iclr-mrrope/}
}