Algorithmic Language Models with Neurally Compiled Libraries
Abstract
Important reasoning tasks such as planning are fundamentally algorithmic, meaning that solving them robustly requires inducing the underlying algorithms rather than relying on shortcuts. Large Language Models lack true algorithmic ability primarily because of limitations in neural network optimization algorithms, optimization data, and optimization objectives, but also due to the inexpressivity of the transformer architecture. To address this lack of algorithmic ability, our paper proposes augmenting LLMs with an internal reasoning module. This module contains a library of fundamental operations and sophisticated differentiable programs, so that common algorithms do not need to be learned from scratch. To accomplish this, we add memory, registers, basic operations, and adaptive recurrence to a billion-parameter-scale transformer architecture built on LLaMA3.2. Then, we define a method for directly compiling algorithms into a differentiable starting library, which is used natively and propagates gradients for optimization. In this workshop paper, we study the feasibility of this augmentation by fine-tuning a small transformer on simple algorithmic tasks with variable computational depth.
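The abstract's idea of adding registers, a library of basic operations, and adaptive recurrence to a transformer can be illustrated concretely. The sketch below is a hypothetical illustration under our own assumptions, not the authors' implementation: a differentiable module whose operand and operation choices are softmax mixtures over soft registers and a tiny operation library (so the "compiled" library stays differentiable and propagates gradients), with an adaptive-computation-style halting unit controlling recurrence depth. All names, shapes, and hyperparameters (`DifferentiableALU`, `n_registers`, `max_steps`) are assumptions made for illustration.

```python
# Minimal, illustrative sketch (NOT the paper's implementation) of a differentiable
# register/ALU module with a small operation library and adaptive recurrence.
import torch
import torch.nn as nn

class DifferentiableALU(nn.Module):
    def __init__(self, d_model: int, n_registers: int = 4, max_steps: int = 8):
        super().__init__()
        self.n_registers = n_registers
        self.max_steps = max_steps
        # Controller reads the hidden state plus registers and emits control signals.
        self.controller = nn.Linear(d_model + n_registers, 16)
        self.op_head = nn.Linear(16, 3)                   # scores over the operation library
        self.arg_head = nn.Linear(16, 2 * n_registers)    # soft selection of two operands
        self.halt_head = nn.Linear(16, 1)                 # adaptive-recurrence halting signal

    def library(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # A tiny "library" of basic operations; a real system would contain many more,
        # plus whole differentiable programs composed from them.
        return torch.stack([a + b, a - b, a * b], dim=-1)  # (batch, 3)

    def forward(self, h: torch.Tensor, registers: torch.Tensor) -> torch.Tensor:
        # h: (batch, d_model) hidden state; registers: (batch, n_registers) soft registers.
        remaining = torch.ones(h.shape[0], device=h.device)  # halting mass not yet used
        for _ in range(self.max_steps):
            ctrl = torch.tanh(self.controller(torch.cat([h, registers], dim=-1)))
            # Soft operand selection: two attention distributions over the registers.
            args = self.arg_head(ctrl).view(-1, 2, self.n_registers).softmax(dim=-1)
            a = (args[:, 0] * registers).sum(-1)
            b = (args[:, 1] * registers).sum(-1)
            # Soft operation selection keeps the whole step differentiable.
            op_weights = self.op_head(ctrl).softmax(dim=-1)
            result = (op_weights * self.library(a, b)).sum(-1)
            # Write the result to register 0 (a fixed write rule, for simplicity).
            registers = torch.cat([result.unsqueeze(-1), registers[:, 1:]], dim=-1)
            # Adaptive recurrence: accumulate halting probability and stop early if done.
            p_halt = torch.sigmoid(self.halt_head(ctrl)).squeeze(-1)
            remaining = remaining * (1.0 - p_halt)
            if bool((remaining < 0.01).all()):
                break
        return registers

# Usage: the module is fully differentiable w.r.t. its parameters and inputs.
alu = DifferentiableALU(d_model=64)
out = alu(torch.randn(2, 64), torch.zeros(2, 4))  # (2, 4)
```

In a hypothetical full system, a pretrained transformer's hidden states would drive such a module, and the operation library would be initialized by "compiling" known algorithms into its parameters rather than learning them from scratch.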
Cite
Text
Saldyt and Kambhampati. "Algorithmic Language Models with Neurally Compiled Libraries." NeurIPS 2024 Workshops: Sys2-Reasoning, 2024.
Markdown
[Saldyt and Kambhampati. "Algorithmic Language Models with Neurally Compiled Libraries." NeurIPS 2024 Workshops: Sys2-Reasoning, 2024.](https://mlanthology.org/neuripsw/2024/saldyt2024neuripsw-algorithmic/)
BibTeX
@inproceedings{saldyt2024neuripsw-algorithmic,
  title     = {{Algorithmic Language Models with Neurally Compiled Libraries}},
  author    = {Saldyt, Lucas Paul and Kambhampati, Subbarao},
  booktitle = {NeurIPS 2024 Workshops: Sys2-Reasoning},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/saldyt2024neuripsw-algorithmic/}
}