Neural Representational Geometry of Concepts in Large Language Models

Abstract

Despite the tremendous success of large language models (LLMs), their internal neural representations remain opaque. Here we characterize the geometric properties of language model representations and their impact on few-shot classification of concept categories. Our work builds on the theory of Sorscher et al. (2022), previously used to study neural representations in the vision domain. We apply this theory to embeddings obtained at various layers of a pre-trained LLM. We focus mainly on LLaMa-3-8B, while also confirming the applicability of our findings to OpenAI's text-embedding-3-large. Our study reveals geometric properties, and their variation across layers, that are unique to language models, and provides insight into their implications for understanding concept representation in LLMs.
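The geometric quantities at the core of this analysis can be computed directly from a matrix of embeddings per concept category. Below is a minimal sketch (not the authors' code) of three quantities from the Sorscher et al. (2022) framework: manifold radius, participation-ratio dimension, and a simplified pairwise "signal" term. Synthetic data stands in for actual LLM layer embeddings, and the normalization in `pairwise_signal` is an illustrative assumption.

import numpy as np

def manifold_geometry(X):
    """Geometry of one concept manifold from its embedding matrix X (P x N):
    centroid, radius (root-mean-square extent), and participation-ratio
    dimension, following the definitions used by Sorscher et al. (2022)."""
    centroid = X.mean(axis=0)
    deltas = X - centroid
    # Eigenvalues of the within-manifold covariance.
    cov = deltas.T @ deltas / len(X)
    lam = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    radius = np.sqrt(lam.sum())                    # R = sqrt(sum_i lambda_i)
    dimension = lam.sum() ** 2 / (lam ** 2).sum()  # D = (sum lam)^2 / sum lam^2
    return centroid, radius, dimension

def pairwise_signal(Xa, Xb):
    """Squared centroid distance normalized by the mean squared radius:
    a simplified form of the 'signal' entering the few-shot SNR."""
    ca, Ra, _ = manifold_geometry(Xa)
    cb, Rb, _ = manifold_geometry(Xb)
    return np.linalg.norm(ca - cb) ** 2 / ((Ra ** 2 + Rb ** 2) / 2)

# Toy stand-in for layer embeddings of two concept categories.
rng = np.random.default_rng(0)
Xa = rng.normal(size=(200, 64)) + 2.0  # concept A, shifted centroid
Xb = rng.normal(size=(200, 64))        # concept B
_, R, D = manifold_geometry(Xa)
print(f"radius={R:.2f}  dimension={D:.1f}  signal={pairwise_signal(Xa, Xb):.3f}")

In the paper's setting, `Xa` and `Xb` would instead hold hidden states extracted at a given layer of the pre-trained LLM, and the metrics would be tracked layer by layer.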

Cite

Text

Schrage et al. "Neural Representational Geometry of Concepts in Large Language Models." NeurIPS 2024 Workshops: NeurReps, 2024.

Markdown

[Schrage et al. "Neural Representational Geometry of Concepts in Large Language Models." NeurIPS 2024 Workshops: NeurReps, 2024.](https://mlanthology.org/neuripsw/2024/schrage2024neuripsw-neural/)

BibTeX

@inproceedings{schrage2024neuripsw-neural,
  title     = {{Neural Representational Geometry of Concepts in Large Language Models}},
  author    = {Schrage, Linden and Irie, Kazuki and Sompolinsky, Haim},
  booktitle = {NeurIPS 2024 Workshops: NeurReps},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/schrage2024neuripsw-neural/}
}