Path Divergence Objective: Boundedly-Rational Decision Making in Partially Observable Environments

Abstract

We introduce the Path Divergence Objective (PDO), a novel model of boundedly-rational decision-making in stochastic, partially observable environments. The PDO is derived from fundamental physical principles, including embodiment and the inherent costs of information processing. This framework enables us to model key features observed in real-world agent behavior, such as curiosity-driven exploration, novelty-seeking, and the intention-behavior gap. By adjusting a single parameter, the PDO can describe a continuous spectrum of decision-making strategies, ranging from highly irrational to perfectly rational. This flexibility makes the PDO applicable to a wide range of scenarios, including modeling biological organisms, simulating interactions between agents with varying degrees of bounded rationality, addressing AI alignment challenges, and designing AI systems that interact more effectively with humans.

Cite

Text

Gavenčiak et al. "Path Divergence Objective: Boundedly-Rational Decision Making in Partially Observable Environments." NeurIPS 2024 Workshops: NeuroAI, 2024.

Markdown

[Gavenčiak et al. "Path Divergence Objective: Boundedly-Rational Decision Making in Partially Observable Environments." NeurIPS 2024 Workshops: NeuroAI, 2024.](https://mlanthology.org/neuripsw/2024/gavenciak2024neuripsw-path/)

BibTeX

@inproceedings{gavenciak2024neuripsw-path,
  title     = {{Path Divergence Objective: Boundedly-Rational Decision Making in Partially Observable Environments}},
  author    = {Gavenčiak, Tomáš and Hyland, David and Da Costa, Lancelot and Wooldridge, Michael J. and Kulveit, Jan},
  booktitle = {NeurIPS 2024 Workshops: NeuroAI},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/gavenciak2024neuripsw-path/}
}