Evaluating Language Models Planning Capabilities on Goal Ordering Challenges

Abstract

Planning involves composing primitive actions to achieve specific goals within a given environment. Classical planning research has established several types of goal-ordering challenges, which have implications for planning heuristics. In this study, we investigate the performance of Large Language Models (LLMs) in identifying whether an ordering between two goals holds. We distinguish between three types of goal-ordering challenges: reasonable, necessary, and optimal. Our findings reveal that LLMs struggle predominantly with reasonable goal-ordering tasks, compared to necessary and optimal goal orderings. Advancing this area could lead to improvements in the planning abilities of LLMs.

Cite

Text

Hirsch et al. "Evaluating Language Models Planning Capabilities on Goal Ordering Challenges." NeurIPS 2024 Workshops: Compositional_Learning, 2024.

Markdown

[Hirsch et al. "Evaluating Language Models Planning Capabilities on Goal Ordering Challenges." NeurIPS 2024 Workshops: Compositional_Learning, 2024.](https://mlanthology.org/neuripsw/2024/hirsch2024neuripsw-evaluating/)

BibTeX

@inproceedings{hirsch2024neuripsw-evaluating,
  title     = {{Evaluating Language Models Planning Capabilities on Goal Ordering Challenges}},
  author    = {Hirsch, Eran and Uziel, Guy and Tavor, Ateret Anaby},
  booktitle = {NeurIPS 2024 Workshops: Compositional_Learning},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/hirsch2024neuripsw-evaluating/}
}