In-Context Learning Dynamics with Random Binary Sequences
Abstract
Large language models (LLMs) trained on huge text datasets demonstrate intriguing capabilities, achieving state-of-the-art performance on tasks they were not explicitly trained for. The precise nature of LLM capabilities is often mysterious, and different prompts can elicit different capabilities through in-context learning. We propose a framework that enables us to analyze in-context learning dynamics to understand latent concepts underlying LLMs’ behavioral patterns. This provides a more nuanced understanding than success-or-failure evaluation benchmarks, but does not require observing internal activations as a mechanistic interpretation of circuits would. Inspired by the cognitive science of human randomness perception, we use random binary sequences as context and study dynamics of in-context learning by manipulating properties of context data, such as sequence length. In the latest GPT-3.5+ models, we find emergent abilities to generate seemingly random numbers and learn basic formal languages, with striking in-context learning dynamics where model outputs transition sharply from seemingly random behaviors to deterministic repetition.
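The core experimental setup described above, prompting a model with random binary sequences of varying length and checking whether its continuations stay seemingly random or collapse into deterministic repetition, can be sketched as follows. This is a minimal illustration and not the authors' code: it uses a local GPT-2 model via Hugging Face transformers as a stand-in for the GPT-3.5+ API models studied in the paper, and the prompt wording and helper names are hypothetical.

```python
# Illustrative sketch (not the paper's implementation): probe in-context learning
# dynamics by prompting a causal LM with random binary ("coin flip") sequences of
# varying length and inspecting whether continuations look random or repetitive.
import random
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # stand-in; the paper studies GPT-3.5+ models served via API
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def binary_context(length: int, p_one: float = 0.5) -> str:
    """Sample a random binary sequence as text, e.g. '0, 1, 1, 0'."""
    return ", ".join(str(int(random.random() < p_one)) for _ in range(length))

def continuation(prompt: str, n_new: int = 20) -> str:
    """Sample a continuation and return only the newly generated text."""
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(
        **inputs,
        max_new_tokens=n_new,
        do_sample=True,
        temperature=1.0,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:])

# Vary the context length and compare output statistics across conditions.
for length in (5, 20, 80):
    prompt = "Here is a sequence of random coin flips: " + binary_context(length) + ","
    print(f"context length {length}: ...{continuation(prompt)}")
```

In practice one would sample many continuations per context length and summarize them with statistics such as repetition rate or run length, to trace how behavior shifts as the in-context sequence grows.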
Cite
Text
Bigelow et al. "In-Context Learning Dynamics with Random Binary Sequences." International Conference on Learning Representations, 2024.
Markdown
[Bigelow et al. "In-Context Learning Dynamics with Random Binary Sequences." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/bigelow2024iclr-incontext/)
BibTeX
@inproceedings{bigelow2024iclr-incontext,
title = {{In-Context Learning Dynamics with Random Binary Sequences}},
author = {Bigelow, Eric J and Lubana, Ekdeep Singh and Dick, Robert P. and Tanaka, Hidenori and Ullman, Tomer},
booktitle = {International Conference on Learning Representations},
year = {2024},
url = {https://mlanthology.org/iclr/2024/bigelow2024iclr-incontext/}
}