Neuro-Symbolic Language Modeling with Automaton-Augmented Retrieval
Abstract
Retrieval-based language models (R-LMs) model the probability of natural language text by combining a standard language model (LM) with examples retrieved from an external datastore at test time. While effective, a major bottleneck of using these models in practice is the computationally costly datastore search, which can be performed as frequently as every time step. In this paper, we present RetoMaton – retrieval automaton – which approximates the datastore search, based on (1) clustering of entries into “states”, and (2) state transitions from previous entries. This effectively results in a weighted finite automaton built on top of the datastore, instead of representing the datastore as a flat list. The creation of the automaton is unsupervised, and a RetoMaton can be constructed from any text collection: either the original training corpus or a corpus from another domain. Traversing this automaton at inference time, in parallel to the LM inference, reduces the LM's perplexity, or alternatively saves up to 83% of the nearest neighbor searches over kNN-LM (Khandelwal et al., 2020), without hurting perplexity. Our code and trained models are available at https://github.com/neulab/retomaton. This is a workshop version of the longer paper that appeared in ICML 2022 (Alon et al., 2022).
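The idea of following transitions from previously retrieved entries instead of searching at every step can be illustrated with a minimal Python sketch. It is not the authors' implementation: `ToyDatastore`, `step`, `retrieval_distribution`, `interpolate`, and the `min_entries` threshold are hypothetical names chosen for illustration, the clustering of entries into states is omitted, and the datastore is a tiny in-memory list searched by brute force. The only part taken from the literature is the interpolation of the LM distribution with the retrieval distribution, as in kNN-LM.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class ToyDatastore:
    """Hypothetical toy datastore: entry i is (key vector, next-token value).

    Entries are stored in corpus order, so entry i implicitly points to
    entry i + 1. Clustering entries into shared states is NOT modeled here.
    """

    def __init__(self, keys, values):
        self.keys = keys
        self.values = values
        self.searches = 0  # counts full (expensive) nearest-neighbor searches

    def search(self, query, k):
        """Brute-force k-nearest-neighbor search over all keys."""
        self.searches += 1
        order = sorted(range(len(self.keys)),
                       key=lambda i: sq_dist(query, self.keys[i]))
        return order[:k]

    def follow(self, entries, token):
        """Follow pointers of entries whose value matches the emitted token."""
        return [i + 1 for i in entries
                if self.values[i] == token and i + 1 < len(self.keys)]


def step(store, query, prev_entries, prev_token, k=2, min_entries=1):
    """Reuse pointer-derived candidates; search only if too few remain."""
    candidates = []
    if prev_token is not None:
        candidates = store.follow(prev_entries, prev_token)
    if len(candidates) < min_entries:
        candidates = store.search(query, k)
    return candidates


def retrieval_distribution(store, query, entries, temperature=1.0):
    """Softmax over negative distances, aggregated per token value."""
    dists = [sq_dist(query, store.keys[i]) for i in entries]
    d_min = min(dists)  # shift for numerical stability
    scores = {}
    for i, d in zip(entries, dists):
        w = math.exp(-(d - d_min) / temperature)
        scores[store.values[i]] = scores.get(store.values[i], 0.0) + w
    total = sum(scores.values())
    return {tok: s / total for tok, s in scores.items()}


def interpolate(p_lm, p_ret, lam):
    """kNN-LM style mixture: lam * p_retrieval + (1 - lam) * p_LM."""
    vocab = set(p_lm) | set(p_ret)
    return {t: lam * p_ret.get(t, 0.0) + (1 - lam) * p_lm.get(t, 0.0)
            for t in vocab}
```

In this sketch, a larger `min_entries` triggers full searches more often, while a smaller one relies more heavily on the followed pointers; this is only meant to convey the kind of trade-off between search cost and perplexity that the abstract describes.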
Cite
Text
Alon et al. "Neuro-Symbolic Language Modeling with Automaton-Augmented Retrieval." ICML 2022 Workshops: KRLM, 2022.
Markdown
[Alon et al. "Neuro-Symbolic Language Modeling with Automaton-Augmented Retrieval." ICML 2022 Workshops: KRLM, 2022.](https://mlanthology.org/icmlw/2022/alon2022icmlw-neurosymbolic/)
BibTeX
@inproceedings{alon2022icmlw-neurosymbolic,
title = {{Neuro-Symbolic Language Modeling with Automaton-Augmented Retrieval}},
author = {Alon, Uri and Xu, Frank F. and He, Junxian and Sengupta, Sudipta and Roth, Dan and Neubig, Graham},
booktitle = {ICML 2022 Workshops: KRLM},
year = {2022},
url = {https://mlanthology.org/icmlw/2022/alon2022icmlw-neurosymbolic/}
}