KBLaM: Knowledge Base Augmented Language Model

Abstract

In this paper, we propose the Knowledge Base augmented Language Model (KBLaM), a new method for augmenting Large Language Models (LLMs) with external knowledge. KBLaM works with a knowledge base (KB) constructed from a corpus of documents, transforming each piece of knowledge in the KB into continuous key-value vector pairs via pre-trained sentence encoders with linear adapters, and integrating them into pre-trained LLMs via a specialized rectangular attention mechanism. Unlike Retrieval-Augmented Generation, KBLaM eliminates external retrieval modules, and unlike in-context learning, its computational overhead scales linearly with KB size rather than quadratically. Our approach enables integrating a large KB of more than 10K triples into an 8B pre-trained LLM with only an 8K context window on a single A100 80GB GPU, and allows for dynamic updates without model fine-tuning or retraining. Experiments demonstrate KBLaM's effectiveness in various tasks, including question-answering and open-ended reasoning, while providing interpretable insights into its use of the augmented knowledge. Code and datasets are available at https://github.com/microsoft/KBLaM/
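The abstract describes the rectangular attention mechanism only at a high level; the sketch below illustrates one plausible reading of it, and is a hypothetical minimal implementation rather than the authors' code. The idea it captures: the N prompt tokens attend over M KB key-value pairs (which the abstract says come from a sentence encoder with linear adapters) concatenated with the prompt's own keys and values, while the KB entries are never attended *from*, so the attention matrix is N × (M + N) rather than (M + N) × (M + N) and cost grows linearly in KB size. All tensor names and shapes here are assumptions for illustration.

```python
# Hypothetical sketch of KBLaM-style "rectangular" attention
# (single head, no batching; not the authors' implementation).
import torch
import torch.nn.functional as F

def rectangular_attention(q, k_prompt, v_prompt, k_kb, v_kb):
    """
    q:        (N, d)  queries from the N prompt tokens
    k_prompt: (N, d)  keys for the prompt tokens
    v_prompt: (N, d)  values for the prompt tokens
    k_kb:     (M, d)  KB keys (e.g. sentence-encoder output + linear adapter)
    v_kb:     (M, d)  KB values (likewise adapted)
    """
    N, d = q.shape
    M = k_kb.shape[0]
    # Prepend the KB pairs to the prompt's keys/values: the score matrix
    # is N x (M + N), i.e. rectangular, so memory/compute is O(N * (M + N))
    # -- linear in M -- instead of quadratic in the combined length.
    k = torch.cat([k_kb, k_prompt], dim=0)
    v = torch.cat([v_kb, v_prompt], dim=0)
    scores = q @ k.T / d**0.5                            # (N, M + N)
    # Causal mask applies only to the prompt block; every query may
    # see the entire KB block.
    causal = torch.triu(torch.ones(N, N, dtype=torch.bool), diagonal=1)
    scores[:, M:] = scores[:, M:].masked_fill(causal, float("-inf"))
    return F.softmax(scores, dim=-1) @ v                 # (N, d)

# Example: 4 prompt tokens attending over a 10K-triple KB, d = 64.
q = torch.randn(4, 64)
kp, vp = torch.randn(4, 64), torch.randn(4, 64)
kb_k, kb_v = torch.randn(10_000, 64), torch.randn(10_000, 64)
out = rectangular_attention(q, kp, vp, kb_k, kb_v)       # shape (4, 64)
```

Because the KB rows enter only as keys and values, adding or removing a triple amounts to editing `k_kb`/`v_kb`, which is consistent with the abstract's claim that the KB can be updated dynamically without fine-tuning or retraining the model.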

Cite

Text

Wang et al. "KBLaM: Knowledge Base Augmented Language Model." International Conference on Learning Representations, 2025.

Markdown

[Wang et al. "KBLaM: Knowledge Base Augmented Language Model." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/wang2025iclr-kblam/)

BibTeX

@inproceedings{wang2025iclr-kblam,
  title     = {{KBLaM: Knowledge Base Augmented Language Model}},
  author    = {Wang, Xi and Isazawa, Taketomo and Mikaelyan, Liana and Hensman, James},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/wang2025iclr-kblam/}
}