Transformers Are Minimax Optimal Nonparametric In-Context Learners

Kim, Juno; Nakamaki, Tai; Suzuki, Taiji

ICMLW 2024

Abstract

In-context learning (ICL) of large language models has proven to be a surprisingly effective method of learning a new task from only a few demonstrative examples. In this paper, we shed light on the efficacy of ICL from the viewpoint of statistical learning theory. We develop approximation and generalization error analyses for a transformer model composed of a deep neural network and one linear attention layer, pretrained on nonparametric regression tasks sampled from general function spaces including the Besov space and piecewise $\gamma$-smooth class. In particular, we show that sufficiently trained transformers can achieve -- and even improve upon -- the minimax optimal estimation risk in context by encoding the most relevant basis representations during pretraining. Our analysis extends to high-dimensional or sequential data and distinguishes the pretraining and in-context generalization gaps, establishing upper and lower bounds w.r.t. both the number of tasks and in-context examples. These findings shed light on the effectiveness of few-shot prompting and the roles of task diversity and representation learning for ICL.
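The model class named in the abstract (a feature map followed by one linear attention layer) can be illustrated with a small numerical sketch. This is not the authors' construction: the fixed cosine basis below stands in for the learned deep-network features, the identity matrix `gamma` stands in for a pretrained attention matrix, and the names `cosine_features`, `linear_attention_predict`, and `target` are hypothetical. Under those assumptions, the linear attention output reduces to a basis-projection estimate computed from the in-context examples.

```python
import numpy as np


def cosine_features(x, num_features):
    """Orthonormal cosine basis on [0, 1] (assumed stand-in for learned features)."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    j = np.arange(num_features).reshape(1, -1)
    feats = np.sqrt(2.0) * np.cos(np.pi * j * x)
    feats[:, 0] = 1.0  # constant basis function has unit norm without the sqrt(2)
    return feats  # shape: (len(x), num_features)


def linear_attention_predict(x_context, y_context, x_query, gamma):
    """Softmax-free attention: phi(x_query)^T @ gamma @ (1/n) sum_i y_i phi(x_i)."""
    num_features = gamma.shape[0]
    phi_ctx = cosine_features(x_context, num_features)
    phi_qry = cosine_features(x_query, num_features)
    # Average of label-weighted context features: empirical basis coefficients.
    summary = phi_ctx.T @ np.asarray(y_context, dtype=float) / len(x_context)
    return phi_qry @ (gamma @ summary)


def target(x):
    """Hypothetical regression task lying in the span of the first two features."""
    return 1.0 + 0.5 * np.sqrt(2.0) * np.cos(np.pi * x)


rng = np.random.default_rng(0)
n, num_features = 2000, 8

# In-context examples: uniform inputs on [0, 1] with noisy labels.
x_context = rng.uniform(0.0, 1.0, size=n)
y_context = target(x_context) + 0.1 * rng.normal(size=n)

x_query = np.linspace(0.0, 1.0, 5)
gamma = np.eye(num_features)  # assumption: identity in place of a pretrained matrix

pred = linear_attention_predict(x_context, y_context, x_query, gamma)
max_err = np.max(np.abs(pred - target(x_query)))
print("max abs error on query points:", max_err)
```

With inputs uniform on [0, 1] the cosine features are orthonormal, so the averaged term estimates the basis coefficients of the task and the identity `gamma` yields a plain projection estimator. In the paper's setting both the features and the attention matrix are learned during pretraining rather than fixed by hand.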

Cite

Text

Kim et al. "Transformers Are Minimax Optimal Nonparametric In-Context Learners." ICML 2024 Workshops: ICL, 2024.

Markdown

[Kim et al. "Transformers Are Minimax Optimal Nonparametric In-Context Learners." ICML 2024 Workshops: ICL, 2024.](https://mlanthology.org/icmlw/2024/kim2024icmlw-transformers-a/)

BibTeX

@inproceedings{kim2024icmlw-transformers-a,
  title     = {{Transformers Are Minimax Optimal Nonparametric In-Context Learners}},
  author    = {Kim, Juno and Nakamaki, Tai and Suzuki, Taiji},
  booktitle = {ICML 2024 Workshops: ICL},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/kim2024icmlw-transformers-a/}
}