Rational Multi-Objective Agents Must Admit Non-Markov Reward Representations

Abstract

This paper considers intuitively appealing axioms for rational, multi-objective agents and derives an impossibility result from which one concludes that such agents must admit non-Markov reward representations. The axioms include the von Neumann-Morgenstern axioms, Pareto indifference, and dynamic consistency. We tie this result to irrational procrastination behaviors observed in humans, and show how the impossibility can be resolved by adopting a non-Markov aggregation scheme. Our work highlights the importance of non-Markov rewards for reinforcement learning and outlines directions for future work.

Cite

Text

Pitis et al. "Rational Multi-Objective Agents Must Admit Non-Markov Reward Representations." NeurIPS 2022 Workshops: MLSW, 2022.

Markdown

[Pitis et al. "Rational Multi-Objective Agents Must Admit Non-Markov Reward Representations." NeurIPS 2022 Workshops: MLSW, 2022.](https://mlanthology.org/neuripsw/2022/pitis2022neuripsw-rational/)

BibTeX

@inproceedings{pitis2022neuripsw-rational,
  title     = {{Rational Multi-Objective Agents Must Admit Non-Markov Reward Representations}},
  author    = {Pitis, Silviu and Bailey, Duncan and Ba, Jimmy},
  booktitle = {NeurIPS 2022 Workshops: MLSW},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/pitis2022neuripsw-rational/}
}