Learning Task Representations from In-Context Learning

Abstract

Large language models (LLMs) excel at in-context learning (ICL), adapting to new tasks from example-based prompts without any parameter updates. Despite these capabilities, how ICL tasks are represented internally, and how well they generalize, remains poorly understood. We introduce a method that encodes the task information in an ICL prompt as a single vector embedding, computed as a weighted sum over the transformer's attention heads, with the weights optimized via gradient descent to address performance shortcomings. Our results indicate that current methods fail to generalize on numeric tasks beyond the sequence lengths seen during training, degrading sharply even when those lengths are only slightly exceeded. Our approach not only addresses this shortcoming but also improves performance across both numeric and linguistic tasks while maintaining high task fidelity. This demonstrates the method's effectiveness at extracting task-specific information from in-context demonstrations and suggests broader applications of ICL in LLMs.
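The sketch below illustrates the core idea described in the abstract: forming a task embedding as a learnable weighted combination of per-head attention activations and fitting the weights by gradient descent. It is a minimal illustration, not the authors' implementation; the tensor shapes, the softmax parameterization of the weights, and the placeholder regression loss are all assumptions made for the example.

# Minimal sketch (assumed details, not the paper's code): combine cached
# attention-head outputs into one task vector via learnable weights, and
# optimize those weights with gradient descent against a placeholder loss.
import torch
import torch.nn as nn

class TaskVectorAggregator(nn.Module):
    """Weighted sum of per-head activations -> single task embedding."""

    def __init__(self, num_layers: int, num_heads: int):
        super().__init__()
        # One scalar weight per (layer, head); softmax keeps them a convex mix
        # (a modeling assumption for this sketch).
        self.logits = nn.Parameter(torch.zeros(num_layers, num_heads))

    def forward(self, head_outputs: torch.Tensor) -> torch.Tensor:
        # head_outputs: (num_layers, num_heads, d_model), e.g. activations read
        # off the last token of an ICL prompt (an assumed extraction step).
        weights = torch.softmax(self.logits.flatten(), dim=0)
        weights = weights.view(*self.logits.shape, 1)
        return (weights * head_outputs).sum(dim=(0, 1))  # (d_model,)

# Toy usage with random stand-in activations and a stand-in target; in practice
# the loss would come from injecting the task vector into a frozen LLM.
num_layers, num_heads, d_model = 12, 12, 768
agg = TaskVectorAggregator(num_layers, num_heads)
opt = torch.optim.Adam(agg.parameters(), lr=1e-2)

head_outputs = torch.randn(num_layers, num_heads, d_model)
target = torch.randn(d_model)

for _ in range(100):
    opt.zero_grad()
    task_vec = agg(head_outputs)
    loss = torch.nn.functional.mse_loss(task_vec, target)  # placeholder objective
    loss.backward()
    opt.step()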

Cite

Text

Saglam et al. "Learning Task Representations from In-Context Learning." ICML 2024 Workshops: ICL, 2024.

Markdown

[Saglam et al. "Learning Task Representations from In-Context Learning." ICML 2024 Workshops: ICL, 2024.](https://mlanthology.org/icmlw/2024/saglam2024icmlw-learning/)

BibTeX

@inproceedings{saglam2024icmlw-learning,
  title     = {{Learning Task Representations from In-Context Learning}},
  author    = {Saglam, Baturay and Yang, Zhuoran and Kalogerias, Dionysis and Karbasi, Amin},
  booktitle = {ICML 2024 Workshops: ICL},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/saglam2024icmlw-learning/}
}