When Do Language Models Need to Be Large?

Abstract

Many leading language models (LMs) demand intensive computational resources during both training and inference. This raises the challenge of lowering resource costs for deployment and enabling faster execution in decision-making tasks, among others. We introduce a novel plug-and-play LM framework named Language Optimising Network Distribution (LONDI). LONDI learns to selectively employ large LMs only where complex decision-making and reasoning are required, while using low-resource LMs (i.e., LMs that require less GPU usage but may not be able to solve the problem alone) everywhere else. LONDI consists of a system of two (off-)policy networks, an LM, a large LM (LLM), and a reinforcement learning module that uses switching controls to quickly learn in which system states to call the LLM. We then introduce a variant of LONDI that maintains a budget constraint on LLM calls and hence on its resource usage. We test LONDI's performance on a range of tasks in ScienceWorld and BabyAI-Text and demonstrate that LONDI can solve tasks solvable only by resource-intensive LLMs while reducing GPU usage by up to 30%.
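The switching-control idea in the abstract can be illustrated with a minimal sketch. The class names, the thresholded switch policy, and the hard budget cap below are all illustrative assumptions, not the authors' implementation; the paper's actual switch is learned by a reinforcement learning module over system states.

```python
class SwitchingController:
    """Illustrative sketch (assumed interface, not LONDI's implementation):
    a per-state switch policy decides whether to call a costly large model
    or a cheap small one, subject to a budget on large-model calls."""

    def __init__(self, switch_policy, budget):
        self.switch_policy = switch_policy  # state -> probability of calling the LLM
        self.budget = budget                # maximum number of LLM calls allowed
        self.llm_calls = 0

    def choose_model(self, state):
        # Call the LLM only when the policy asks for it AND budget remains;
        # otherwise fall back to the low-resource LM.
        if self.llm_calls < self.budget and self.switch_policy(state) > 0.5:
            self.llm_calls += 1
            return "large"
        return "small"


# Hypothetical switch policy: prefer the LLM on "hard" states only.
controller = SwitchingController(
    switch_policy=lambda s: 0.9 if s == "hard" else 0.1,
    budget=2,
)
decisions = [controller.choose_model(s) for s in ["easy", "hard", "hard", "hard"]]
# The budget of 2 caps LLM usage: ["small", "large", "large", "small"]
```

The budget-constrained variant described in the abstract corresponds to the `budget` check above: once the allowance of LLM calls is exhausted, every remaining state is handled by the small model regardless of the switch policy.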

Cite

Text

Chen et al. "When Do Language Models Need to Be Large?." ICML 2024 Workshops: FM-Wild, 2024.

Markdown

[Chen et al. "When Do Language Models Need to Be Large?." ICML 2024 Workshops: FM-Wild, 2024.](https://mlanthology.org/icmlw/2024/chen2024icmlw-language/)

BibTeX

@inproceedings{chen2024icmlw-language,
  title     = {{When Do Language Models Need to Be Large?}},
  author    = {Chen, Zhixun and Du, Yali and Mguni, David Henry},
  booktitle = {ICML 2024 Workshops: FM-Wild},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/chen2024icmlw-language/}
}