Memory Mosaics

Abstract

Memory Mosaics are networks of associative memories working in concert to achieve a prediction task of interest. Like transformers, memory mosaics possess compositional and in-context learning capabilities. Unlike transformers, memory mosaics achieve these capabilities in a comparatively transparent way (“predictive disentanglement”). We illustrate these capabilities on a toy example and also show that memory mosaics perform as well as or better than transformers on medium-scale language modeling tasks.
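
To make the abstract's central building block concrete: an associative memory of this kind can be read as kernel regression over stored key-value pairs. The sketch below assumes a Gaussian kernel smoothing formulation; the function and parameter names are illustrative, not the authors' API.

import numpy as np

def retrieve(keys, values, query, beta=1.0):
    # Gaussian kernel weights: stored keys closer to the query
    # contribute more to the retrieved output (assumed formulation).
    sq_dists = np.sum((keys - query) ** 2, axis=1)
    weights = np.exp(-beta * sq_dists)
    weights /= weights.sum()
    # The prediction is the kernel-weighted average of stored values.
    return weights @ values

# Toy usage: store three (key, value) pairs, then query near the first key.
keys = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
values = np.array([[1.0], [2.0], [3.0]])
print(retrieve(keys, values, np.array([0.1, 0.0]), beta=4.0))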

Cite

Text

Zhang et al. "Memory Mosaics." International Conference on Learning Representations, 2025.

Markdown

[Zhang et al. "Memory Mosaics." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/zhang2025iclr-memory/)

BibTeX

@inproceedings{zhang2025iclr-memory,
  title     = {{Memory Mosaics}},
  author    = {Zhang, Jianyu and Nolte, Niklas and Sadhukhan, Ranajoy and Chen, Beidi and Bottou, Leon},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/zhang2025iclr-memory/}
}