CD-POS: Long Context Generalization in LLMs Through Continuous and Discrete Position Synthesis
Abstract
Large language models are critical for natural language processing and multi-modal tasks, but they struggle on tasks that require long context windows because of computational and memory limitations. Existing methods for extending these windows are resource-intensive. The proposed Continuous and Discrete Position Synthesis (CD-Pos) addresses this by using synthesized position indices to expand context windows efficiently. CD-Pos divides sequences into segments with continuous indices, which increases the distances between tokens while preserving local information. Empirical evaluations show that CD-Pos extends context windows up to 128k tokens while maintaining LLMs' performance on general tasks.
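The segment-based idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `synthesize_position_ids`, its parameters, the near-equal segment split, and the random gap sampling are all assumptions made for illustration. It only shows how a short sequence can be given position indices that are continuous within each segment yet spread across a much longer target window.

```python
import random


def synthesize_position_ids(seq_len, target_len, num_segments, seed=None):
    """Hypothetical sketch: assign position ids in [0, target_len) to seq_len tokens.

    Tokens are split into num_segments contiguous segments. Ids are continuous
    inside a segment (local order is preserved), and a random non-negative gap
    is skipped between segments (token distances grow beyond seq_len).
    """
    if not (1 <= num_segments <= seq_len <= target_len):
        raise ValueError("need 1 <= num_segments <= seq_len <= target_len")

    rng = random.Random(seed)

    # Near-equal segment lengths (an assumption, not taken from the paper).
    base, extra = divmod(seq_len, num_segments)
    lengths = [base + (1 if i < extra else 0) for i in range(num_segments)]

    # Sorted random cut points split the spare positions into gaps; the total
    # skipped amount never exceeds target_len - seq_len.
    spare = target_len - seq_len
    cuts = sorted(rng.randint(0, spare) for _ in range(num_segments - 1))

    position_ids = []
    start = 0
    prev_cut = 0
    for i, length in enumerate(lengths):
        if i > 0:
            start += cuts[i - 1] - prev_cut  # jump over the gap before this segment
            prev_cut = cuts[i - 1]
        position_ids.extend(range(start, start + length))
        start += length
    return position_ids


if __name__ == "__main__":
    # 8 tokens mapped into a 32-position window using 4 segments of 2 tokens.
    print(synthesize_position_ids(seq_len=8, target_len=32, num_segments=4, seed=0))
```

In a setup like this, the resulting list would stand in for the default `0..seq_len-1` indices fed to the model's positional encoding during fine-tuning; how CD-Pos actually chooses segments and gaps is described in the paper itself.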
Cite
Text
Hu et al. "CD-POS: Long Context Generalization in LLMs Through Continuous and Discrete Position Synthesis." ICML 2024 Workshops: LCFM, 2024.
Markdown
[Hu et al. "CD-POS: Long Context Generalization in LLMs Through Continuous and Discrete Position Synthesis." ICML 2024 Workshops: LCFM, 2024.](https://mlanthology.org/icmlw/2024/hu2024icmlw-cdpos/)
BibTeX
@inproceedings{hu2024icmlw-cdpos,
title = {{CD-POS: Long Context Generalization in LLMs Through Continuous and Discrete Position Synthesis}},
author = {Hu, Zhiyuan and Liu, Yuliang and Zhao, Jinman and Wang, Suyuchen and WangYan and Shen, Wei and Yin, Chao and Hooi, Bryan},
booktitle = {ICML 2024 Workshops: LCFM},
year = {2024},
url = {https://mlanthology.org/icmlw/2024/hu2024icmlw-cdpos/}
}