Auto-Enhance: Towards a Meta-Benchmark to Evaluate AI Agents' Ability to Improve Other Agents

Abstract

LLM agents are rapidly improving at benchmarks designed to measure software development and reasoning ability. Since the creation and improvement of such systems is itself a software development task, we focus on this particular subset of abilities. We describe the first steps towards a "meta-benchmark", in which a "top-level" agent aims to improve the performance of "reference agents" on tasks drawn from a range of other benchmarks in the literature. We show that top-level agents are able to a) make other systems more resistant to prompt injection, b) fix non-trivial bugs in agent scaffolds, c) compare the performance of various LLMs at "operating" reference agents' scaffolds, and d) make progress on model-editing tasks such as unlearning. We consider this a first step towards a broad and extensible meta-benchmark that would allow us to measure AI self-improvement abilities.
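
The abstract describes a two-level evaluation: a top-level agent edits a reference agent, and the uplift is measured on tasks from an existing sub-benchmark. The sketch below illustrates that loop only; the names (`MetaTask`, `evaluate_top_level_agent`, `score_fn`) are hypothetical and are not the paper's actual harness or API.

```python
# Minimal sketch of the meta-benchmark loop described in the abstract (assumed
# structure, not the paper's implementation): score the reference agent before
# and after the top-level agent has modified it, and report the uplift.

from dataclasses import dataclass
from typing import Callable


@dataclass
class MetaTask:
    """One meta-benchmark task: a reference agent plus a sub-benchmark to score it on."""
    reference_agent_repo: str                # path to the reference agent's scaffold code
    subbenchmark: str                        # e.g. a prompt-injection or bug-fixing suite
    score_fn: Callable[[str, str], float]    # scores a scaffold on the sub-benchmark


def evaluate_top_level_agent(task: MetaTask, improve: Callable[[str], str]) -> float:
    """Measure how much a top-level agent improves a reference agent.

    `improve` stands in for the top-level agent: it takes the reference agent's
    repo and returns a (hopefully improved) copy of it.
    """
    baseline = task.score_fn(task.reference_agent_repo, task.subbenchmark)
    improved_repo = improve(task.reference_agent_repo)
    improved = task.score_fn(improved_repo, task.subbenchmark)
    return improved - baseline  # positive uplift means the top-level agent helped
```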

Cite

Text

Brown et al. "Auto-Enhance: Towards a Meta-Benchmark to Evaluate AI Agents' Ability to Improve Other Agents." NeurIPS 2024 Workshops: SoLaR, 2024.

Markdown

[Brown et al. "Auto-Enhance: Towards a Meta-Benchmark to Evaluate AI Agents' Ability to Improve Other Agents." NeurIPS 2024 Workshops: SoLaR, 2024.](https://mlanthology.org/neuripsw/2024/brown2024neuripsw-autoenhance-a/)

BibTeX

@inproceedings{brown2024neuripsw-autoenhance-a,
  title     = {{Auto-Enhance: Towards a Meta-Benchmark to Evaluate AI Agents' Ability to Improve Other Agents}},
  author    = {Brown, Samuel F. and Labib, Basil and Lugoj, Codruta and Y, Sai Sasank},
  booktitle = {NeurIPS 2024 Workshops: SoLaR},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/brown2024neuripsw-autoenhance-a/}
}