Language Models Are Weak Learners

Abstract

A central notion in practical and theoretical machine learning is that of a weak learner: a classifier that achieves better-than-random performance on any given distribution over data, even if only by a small margin. Such weak learners form the practical basis for canonical machine learning methods such as boosting. In this work, we show that prompt-based large language models (LLMs) can operate effectively as such weak learners. Specifically, we use an LLM as a weak learner in a boosting algorithm applied to tabular data. We show that, given text descriptions of tabular data samples that are properly sampled according to the distribution of interest, LLMs can produce a summary of the samples that serves as a template for classification and thereby acts as a weak learner on this task. We incorporate these models into a boosting approach, which in many settings can leverage the knowledge within the LLM to outperform traditional tree-based boosting. The model outperforms few-shot learning and occasionally even more involved fine-tuning procedures, particularly for some tasks involving small numbers of data points. The results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning models.
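
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, assuming a standard AdaBoost-style loop with labels in {-1, +1}. Each round draws text descriptions according to the current example weights, hands them to a weak learner, and reweights the data based on that learner's errors. The names `stub_weak_learner`, `boost`, and `predict` are hypothetical, and the keyword-based stub merely stands in for the step where an LLM would summarize the sampled descriptions into a classification template. It is not the authors' implementation.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Example = Tuple[str, int]          # (text description of a row, label in {-1, +1})
Classifier = Callable[[str], int]  # maps a text description to -1 or +1


def stub_weak_learner(samples: Sequence[Example]) -> Classifier:
    """Hypothetical stand-in for the LLM summarization step.

    Picks the single token whose presence is most associated with one label
    and classifies by whether that token appears in the description.
    """
    scores = {}
    for text, label in samples:
        for token in set(text.split()):
            scores[token] = scores.get(token, 0) + label
    best = max(scores, key=lambda t: abs(scores[t]))
    sign = 1 if scores[best] >= 0 else -1
    return lambda text: sign if best in text.split() else -sign


def boost(data: Sequence[Example],
          weak_learner: Callable[[Sequence[Example]], Classifier],
          rounds: int = 5, k: int = 8,
          seed: int = 0) -> List[Tuple[float, Classifier]]:
    """AdaBoost-style loop (an assumption; the abstract names only 'boosting')."""
    rng = random.Random(seed)
    n = len(data)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Sample descriptions according to the current distribution of interest.
        samples = rng.choices(data, weights=weights, k=k)
        h = weak_learner(samples)
        # Weighted error of this round's learner on the full training set.
        err = sum(w for (x, y), w in zip(data, weights) if h(x) != y)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # Up-weight misclassified examples, then renormalize.
        weights = [w * math.exp(-alpha * y * h(x)) for (x, y), w in zip(data, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble: Sequence[Tuple[float, Classifier]], text: str) -> int:
    """Weighted vote of all weak learners."""
    score = sum(alpha * h(text) for alpha, h in ensemble)
    return 1 if score >= 0 else -1
```

In an actual LLM-based variant, `stub_weak_learner` would be replaced by a function that prompts a model with the sampled descriptions and uses the returned summary to classify new rows; the boosting loop itself would not need to change.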

Cite

Text

Manikandan et al. "Language Models Are Weak Learners." ICML 2023 Workshops: ES-FoMO, 2023.

Markdown

[Manikandan et al. "Language Models Are Weak Learners." ICML 2023 Workshops: ES-FoMO, 2023.](https://mlanthology.org/icmlw/2023/manikandan2023icmlw-language/)

BibTeX

@inproceedings{manikandan2023icmlw-language,
  title     = {{Language Models Are Weak Learners}},
  author    = {Manikandan, Hariharan and Jiang, Yiding and Kolter, J Zico},
  booktitle = {ICML 2023 Workshops: ES-FoMO},
  year      = {2023},
  url       = {https://mlanthology.org/icmlw/2023/manikandan2023icmlw-language/}
}