MLP Memory: A Retriever-Pretrained Memory for Large Language Models

Abstract

Modern approaches to improving the factual accuracy and knowledge utilization of Large Language Models face a fundamental trade-off: non-parametric retrieval-augmented generation (RAG) provides flexible access to external knowledge but suffers from high inference latency and shallow integration, while parametric fine-tuning methods such as LoRA risk catastrophic forgetting and degraded general capabilities. In this work, we propose MLP Memory, a lightweight parametric module that learns to internalize retrieval patterns without explicit document access. By pretraining an MLP to imitate a $k$NN retriever's behavior on the entire pretraining dataset, we create a differentiable memory component that captures the benefits of retrieval-based knowledge access in a fully parametric form. Our architecture integrates this pretrained MLP Memory with Transformer decoders through simple probability interpolation, achieving a 12.3\% relative improvement on five question-answering benchmarks and a 5.2-point absolute gain across nine general NLP tasks, while reducing hallucinations by up to 10 points on HaluEval. Moreover, MLP Memory delivers 2.5$\times$ faster inference than RAG while also achieving higher accuracy. Our findings show that learning retrieval patterns parametrically bridges the gap between efficient inference and effective knowledge access, offering a practical alternative to both RAG and fine-tuning approaches.
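The "simple probability interpolation" mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the convex combination below, the weight name `lam`, its value, and the toy three-token vocabulary are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpolate(p_lm, p_mem, lam=0.25):
    """Mix two next-token distributions as lam * p_mem + (1 - lam) * p_lm.

    `lam` is a hypothetical interpolation weight; the abstract does not
    state how the weight is named or chosen.
    """
    if len(p_lm) != len(p_mem):
        raise ValueError("distributions must cover the same vocabulary")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * m + (1.0 - lam) * l for l, m in zip(p_lm, p_mem)]


# Toy logits over a three-token vocabulary (made-up numbers).
lm_logits = [2.0, 1.0, 0.1]    # stand-in for the Transformer decoder output
mem_logits = [0.5, 2.5, 0.0]   # stand-in for the MLP Memory output

p_lm = softmax(lm_logits)
p_mem = softmax(mem_logits)
p_final = interpolate(p_lm, p_mem, lam=0.25)
print(p_final)
```

Because both inputs are valid distributions and the weights sum to one, the mixture is itself a valid distribution; setting `lam` to 0 recovers the decoder's prediction unchanged, and setting it to 1 uses the memory's prediction alone.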

Cite

Text

Wei et al. "MLP Memory: A Retriever-Pretrained Memory for Large Language Models." International Conference on Learning Representations, 2026.

Markdown

[Wei et al. "MLP Memory: A Retriever-Pretrained Memory for Large Language Models." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/wei2026iclr-mlp/)

BibTeX

@inproceedings{wei2026iclr-mlp,
  title     = {{MLP Memory: A Retriever-Pretrained Memory for Large Language Models}},
  author    = {Wei, Rubin and Cao, Jiaqi and Wang, Jiarui and Kai, Jushi and Guo, Qipeng and Zhou, Bowen and Lin, Zhouhan},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/wei2026iclr-mlp/}
}