Meta-Prompt Optimization for LLM-Based Sequential Decision Making

Abstract

Large language models (LLMs) have recently been employed as agents to solve sequential decision-making tasks such as Bayesian optimization and multi-armed bandits (MAB). These works usually adopt an LLM for sequential action selection by providing it with a fixed, manually designed meta-prompt. However, numerous previous works have found that the prompt has a significant impact on the performance of the LLM, which calls for a method to automatically optimize the meta-prompt for LLM-based agents. Unfortunately, the non-stationarity in the reward observations during LLM-based sequential decision-making makes meta-prompt optimization highly challenging. To address this challenge, we draw inspirations from adversarial bandit algorithms, which are inherently capable of handling non-stationary reward observations. Building on this foundation, we propose our EXPonential-weight algorithm for prompt Optimization (EXPO) to automatically optimize the task description and meta-instruction in the meta-prompt for LLM-based agents. We also extend EXPO to additionally optimize the exemplars (i.e., history of interactions) in the meta-prompt to further enhance the performance, hence introducing our EXPO-ES algorithm. We use extensive experiments to show that our algorithms significantly improve the performance of LLM-based sequential decision-making.

Cite

Text

Kong et al. "Meta-Prompt Optimization for LLM-Based Sequential Decision Making." ICLR 2025 Workshops: LLM_Reason_and_Plan, 2025.

Markdown

[Kong et al. "Meta-Prompt Optimization for LLM-Based Sequential Decision Making." ICLR 2025 Workshops: LLM_Reason_and_Plan, 2025.](https://mlanthology.org/iclrw/2025/kong2025iclrw-metaprompt/)

BibTeX

@inproceedings{kong2025iclrw-metaprompt,
  title     = {{Meta-Prompt Optimization for LLM-Based Sequential Decision Making}},
  author    = {Kong, Mingze and Wang, Zhiyong and Shu, Yao and Dai, Zhongxiang},
  booktitle = {ICLR 2025 Workshops: LLM_Reason_and_Plan},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/kong2025iclrw-metaprompt/}
}