How Linearly Associative Are Memories in Large Language Models?
Abstract
Large Language Models (LLMs) exhibit a remarkable capacity to store and retrieve factual knowledge, yet the precise mechanisms by which they encode and recall this information remain under debate. Two main frameworks have been proposed to explain memory storage within transformer feed-forward layers: (1) a key-value memory view, and (2) a linear associative memory view. In this paper, we investigate the extent to which the second MLP matrix in LLMs behaves as a linear associative memory (LAM). By measuring pairwise angles between the input activation vectors that act as key vectors in the LAM view, we find that the second MLP matrix exhibits relatively high orthogonality and minimal cross-talk, supporting the LAM interpretation for generic retrieval. However, we also find that the subject-token representations used in factual recall are significantly less orthogonal, indicating greater interference and entanglement. This implies that editing factual “memories” within these matrices may trigger unintended side effects in other related knowledge. Our results highlight both the promise and the pitfalls of viewing feed-forward layers as linear associative memories, underscoring the need for careful strategies when modifying factual representations in LLMs.
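The role of orthogonality here follows from the standard LAM formulation: a matrix storing pairs (k_i, v_i) as W = Σ_i v_i k_i^T retrieves W k_j = v_j + Σ_{i≠j} v_i (k_i^T k_j), so recall is clean only when the keys are nearly orthogonal; the residual sum is the cross-talk the paper measures through pairwise angles. Below is a minimal sketch (not the authors' code) of such a pairwise-angle measurement, with hypothetical tensor shapes standing in for extracted key activations.

```python
import torch

def pairwise_angle_stats(keys: torch.Tensor):
    """keys: (n, d) tensor, one hypothetical key vector per row
    (e.g., MLP pre-activations treated as keys into the second MLP matrix)."""
    normed = torch.nn.functional.normalize(keys, dim=1)        # unit-norm each key
    cos = normed @ normed.T                                    # (n, n) cosine-similarity matrix
    off_diag = cos[~torch.eye(len(keys), dtype=torch.bool)]    # drop self-similarities
    angles = torch.rad2deg(torch.acos(off_diag.clamp(-1.0, 1.0)))
    return {
        "mean_abs_cosine": off_diag.abs().mean().item(),       # 0 => perfectly orthogonal keys
        "mean_angle_deg": angles.mean().item(),                # 90 degrees => orthogonal keys
    }

# Example: random Gaussian keys, which are near-orthogonal in high dimensions
# and serve as a natural baseline for comparison.
print(pairwise_angle_stats(torch.randn(512, 4096)))
```

Comparing such statistics for generic key activations against those for subject-token representations is one way to quantify the interference contrast the abstract describes.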
Cite
Gupta et al. "How Linearly Associative Are Memories in Large Language Models?" ICLR 2025 Workshops: NFAM, 2025. https://mlanthology.org/iclrw/2025/gupta2025iclrw-linearly/

BibTeX
@inproceedings{gupta2025iclrw-linearly,
title = {{How Linearly Associative Are Memories in Large Language Models?}},
author = {Gupta, Akshat and Sindhu, Nehal and Anumanchipalli, Gopala},
booktitle = {ICLR 2025 Workshops: NFAM},
year = {2025},
url = {https://mlanthology.org/iclrw/2025/gupta2025iclrw-linearly/}
}