Policy Learning with a Language Bottleneck
Abstract
Modern AI systems such as self-driving cars and game-playing agents achieve superhuman performance, but they often lack human-like generalization, interpretability, and interoperability with human users. This paper introduces *Policy Learning with a Language Bottleneck* (PLLB), a framework enabling AI agents to generate linguistic rules that capture the high-level strategies underlying rewarding behaviors. PLLB alternates between a *rule generation* step guided by language models and an *update* step where agents learn new policies guided by the rules. Crucially, PLLB enables this kind of language-guided learning even when a natural language rule is insufficient to completely describe the target policy. Across five diverse tasks, including a two-player signaling game, maze navigation, image reconstruction, and robot grasp planning, we show that PLLB learns more interpretable and generalizable behaviors than standard policy learning methods. In three additional human subject studies, we show that the learned rules significantly improve human task performance, enabling more effective human-AI coordination.
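To make the alternation concrete, the sketch below shows the two-step loop the abstract describes in minimal, self-contained form. Every name here (`Episode`, `rollout`, `generate_rule`, `update_policy`) is an illustrative stand-in rather than the paper's actual API, and the language-model call is replaced by a hard-coded placeholder.

```python
"""Minimal sketch of the PLLB loop: alternate a rule-generation step
(summarizing rewarding behavior in language) with an update step
(learning a new policy guided by the rule). All names are hypothetical
stand-ins for the paper's components, not its real interface."""

import random
from dataclasses import dataclass


@dataclass
class Episode:
    actions: list
    reward: float


def rollout(policy) -> Episode:
    # Toy stand-in environment: reward simply counts how often
    # the policy picks action 1.
    actions = [policy() for _ in range(5)]
    return Episode(actions, reward=sum(actions))


def generate_rule(top_episodes) -> str:
    # In PLLB this is a language-model call that summarizes the
    # high-level strategy behind the most rewarding episodes;
    # here it is a fixed placeholder string.
    return "prefer action 1"


def update_policy(rule):
    # In PLLB the agent learns a new policy guided by the rule;
    # here the rule deterministically selects a policy.
    if "action 1" in rule:
        return lambda: 1
    return lambda: random.randint(0, 1)


policy = lambda: random.randint(0, 1)  # initial random policy
for _ in range(3):
    episodes = [rollout(policy) for _ in range(16)]
    # Rule generation step: summarize the best-performing behavior.
    top = sorted(episodes, key=lambda e: e.reward, reverse=True)[:4]
    rule = generate_rule(top)
    # Update step: learn a new policy guided by the rule.
    policy = update_policy(rule)

print(rule)  # -> "prefer action 1"
```

Note that the rule acts as a bottleneck: only the linguistic summary, not the raw episodes, flows into the update step, which is what makes the learned strategy inspectable.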
Cite
Text

Srivastava et al. "Policy Learning with a Language Bottleneck." Transactions on Machine Learning Research, 2026.

Markdown

[Srivastava et al. "Policy Learning with a Language Bottleneck." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/srivastava2026tmlr-policy/)

BibTeX
@article{srivastava2026tmlr-policy,
  title = {{Policy Learning with a Language Bottleneck}},
  author = {Srivastava, Megha and Colas, Cédric and Sadigh, Dorsa and Andreas, Jacob},
  journal = {Transactions on Machine Learning Research},
  year = {2026},
  url = {https://mlanthology.org/tmlr/2026/srivastava2026tmlr-policy/}
}